[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <23dbe0da-581e-2444-7126-428e79514614@amd.com>
Date: Thu, 8 Jul 2021 11:48:26 -0500
From: Brijesh Singh <brijesh.singh@....com>
To: Dave Hansen <dave.hansen@...el.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org
Cc: brijesh.singh@....com, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
Tom Lendacky <thomas.lendacky@....com>,
"H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dov Murik <dovmurik@...ux.ibm.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Borislav Petkov <bp@...en8.de>,
Michael Roth <michael.roth@....com>,
Vlastimil Babka <vbabka@...e.cz>, tony.luck@...el.com,
npmccallum@...hat.com, brijesh.ksingh@...il.com
Subject: Re: [PATCH Part2 RFC v4 09/40] x86/fault: Add support to dump RMP
entry on fault
On 7/8/21 10:30 AM, Dave Hansen wrote:
> I was even thinking that you could use the pmd/pte entries that come
> from the walk in dump_pagetable().
>
> BTW, I think the snp_lookup_page_in_rmptable() interface is probably
> wrong. It takes a 'struct page':
>
In some cases the caller already have 'struct page' so it was easier on
them. I can change it to snp_lookup_pfn_in_rmptable() to simplify the
things. In the cases where the caller already have 'struct page' will
simply do page_to_pfn().
> +struct rmpentry *snp_lookup_page_in_rmptable(struct page *page, int *level)
>
> but then immediately converts it to a paddr:
>
>> + unsigned long phys = page_to_pfn(page) << PAGE_SHIFT;
>
> If you just had it take a paddr, you wouldn't have to mess with all of
> this pfn_valid() and phys_to_page() error checking.
Noted.
>
> Or fix the snp_lookup_page_in_rmptable() interface, please.
Yes.
>
> Let's capitalize "RMP" here, please.
Noted.
>
>> PGD 304b201067 P4D 304b201067 PUD 20c7f06063 PMD 20c8976063 PTE
>> 80000020cee00163
>> RMPEntry paddr 0x20cee00000 [assigned=1 immutable=1 pagesize=0 gpa=0x0
^^^^^^^^^^
>> asid=0 vmsa=0 validated=0]
>> RMPEntry paddr 0x20cee00000 000000000000000f 8000000000000ffd
>
> That's a good example, thanks!
>
> But, it does make me think that we shouldn't be spitting out
> "immutable". Should we call it "readonly" or something so that folks
> have a better chance of figuring out what's wrong? Even better, should
> we be looking specifically for X86_PF_RMP *and* immutable=1 and spitting
> out something in english about it?
>
A write to an assigned page will cause the RMP violation. In this case,
the page happen to be firmware page hence the immutable bit was also
set. I am trying to use the field name as documented in the APM and
SEV-SNP firmware spec.
> This also *looks* to be spitting out the same "RMPEntry paddr
> 0x20cee00000" more than once. Maybe we should just indent the extra
> entries instead of repeating things. The high/low are missing a "0x"
> prefix, they also don't have any kind of text label.
>
Noted, I will fix it.
>
> I actually really like processing the fields. I think it's a good
> investment to make the error messages as self-documenting as possible
> and not require the poor souls who are decoding oopses to also keep each
> vendor's architecture manuals at hand.
>
Sounds good, I will keep it as-is.
>>
>> The reason for iterating through 2MB region is; if the faulting address
>> is not assigned in the RMP table, and page table walk level is 2MB then
>> one of entry within the large page is the root cause of the fault. Since
>> we don't know which entry hence I dump all the non-zero entries.
>
> Logically you can figure this out though, right? Why throw 511 entries
> at the console when we *know* they're useless?
Logically its going to be tricky to figure out which exact entry caused
the fault, hence I dump any non-zero entry. I understand it may dump
some useless.
>> There are two cases which we need to consider:
>>
>> 1) the faulting page is a guest private (aka assigned)
>> 2) the faulting page is a hypervisor (aka shared)
>>
>> We will be primarily seeing #1. In this case, we know its a assigned
>> page, and we can decode the fields.
>>
>> The #2 will happen in rare conditions,
>
> What rare conditions?
>
One such condition is RMP "in-use" bit is set; see the patch 20/40.
After applying the patch we should not see "in-use" bit set. If we run
into similar issues, a full RMP dump will greatly help debug.
>> if it happens, one of the undocumented bit in the RMP entry can
>> provide us some useful information hence we dump the raw values.
> You're saying that there are things that can cause RMP faults that
> aren't documented? That's rather nasty for your users, don't you think?
>
The "in-use" bit in the RMP entry caught me off guard. The AMD APM does
says that hardware sets in-use bit but it *never* explained in the
detail on how to check if the fault was due to in-use bit in the RMP
table. As I said, the documentation folks will be updating the RMP entry
to document the in-use bit. I hope we will not see any other
undocumented surprises, I am keeping my finger cross :)
> I'd be fine if you want to define a mask of unknown bits and spit out to
> the users that some unknown bits are set.
>
Powered by blists - more mailing lists