linux-kernel - Re: [PATCH Part2 RFC v4 09/40] x86/fault: Add support to dump RMP entry on fault

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <23dbe0da-581e-2444-7126-428e79514614@amd.com>
Date:   Thu, 8 Jul 2021 11:48:26 -0500
From:   Brijesh Singh <brijesh.singh@....com>
To:     Dave Hansen <dave.hansen@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org
Cc:     brijesh.singh@....com, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        Tom Lendacky <thomas.lendacky@....com>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Michael Roth <michael.roth@....com>,
        Vlastimil Babka <vbabka@...e.cz>, tony.luck@...el.com,
        npmccallum@...hat.com, brijesh.ksingh@...il.com
Subject: Re: [PATCH Part2 RFC v4 09/40] x86/fault: Add support to dump RMP
 entry on fault



On 7/8/21 10:30 AM, Dave Hansen wrote:
> I was even thinking that you could use the pmd/pte entries that come
> from the walk in dump_pagetable().
> 
> BTW, I think the snp_lookup_page_in_rmptable() interface is probably
> wrong.  It takes a 'struct page':
> 

In some cases the caller already have 'struct page' so it was easier on 
them. I can change it to snp_lookup_pfn_in_rmptable() to simplify the 
things. In the cases where the caller already have 'struct page' will 
simply do page_to_pfn().


> +struct rmpentry *snp_lookup_page_in_rmptable(struct page *page, int *level)
> 
> but then immediately converts it to a paddr:
> 
>> +	unsigned long phys = page_to_pfn(page) << PAGE_SHIFT;
> 
> If you just had it take a paddr, you wouldn't have to mess with all of
> this pfn_valid() and phys_to_page() error checking.

Noted.

> 
> Or fix the snp_lookup_page_in_rmptable() interface, please.

Yes.

> 
> Let's capitalize "RMP" here, please.

Noted.

> 
>> PGD 304b201067 P4D 304b201067 PUD 20c7f06063 PMD 20c8976063 PTE
>> 80000020cee00163
>> RMPEntry paddr 0x20cee00000 [assigned=1 immutable=1 pagesize=0 gpa=0x0
                                 ^^^^^^^^^^

>> asid=0 vmsa=0 validated=0]
>> RMPEntry paddr 0x20cee00000 000000000000000f 8000000000000ffd
> 
> That's a good example, thanks!
> 
> But, it does make me think that we shouldn't be spitting out
> "immutable".  Should we call it "readonly" or something so that folks
> have a better chance of figuring out what's wrong?  Even better, should
> we be looking specifically for X86_PF_RMP *and* immutable=1 and spitting
> out something in english about it?
> 

A write to an assigned page will cause the RMP violation. In this case, 
the page happen to be firmware page hence the immutable bit was also 
set. I am trying to use the field name as documented in the APM and 
SEV-SNP firmware spec.


> This also *looks* to be spitting out the same "RMPEntry paddr
> 0x20cee00000" more than once.  Maybe we should just indent the extra
> entries instead of repeating things.  The high/low are missing a "0x"
> prefix, they also don't have any kind of text label.
> 
Noted, I will fix it.

> 
> I actually really like processing the fields.  I think it's a good
> investment to make the error messages as self-documenting as possible
> and not require the poor souls who are decoding oopses to also keep each
> vendor's architecture manuals at hand.
> 
Sounds good, I will keep it as-is.


>>
>> The reason for iterating through 2MB region is; if the faulting address
>> is not assigned in the RMP table, and page table walk level is 2MB then
>> one of entry within the large page is the root cause of the fault. Since
>> we don't know which entry hence I dump all the non-zero entries.
> 
> Logically you can figure this out though, right?  Why throw 511 entries
> at the console when we *know* they're useless?

Logically its going to be tricky to figure out which exact entry caused 
the fault, hence I dump any non-zero entry. I understand it may dump 
some useless.


>> There are two cases which we need to consider:
>>
>> 1) the faulting page is a guest private (aka assigned)
>> 2) the faulting page is a hypervisor (aka shared)
>>
>> We will be primarily seeing #1. In this case, we know its a assigned
>> page, and we can decode the fields.
>>
>> The #2 will happen in rare conditions,
> 
> What rare conditions?
> 

One such condition is RMP "in-use" bit is set; see the patch 20/40. 
After applying the patch we should not see "in-use" bit set. If we run 
into similar issues, a full RMP dump will greatly help debug.


>> if it happens, one of the undocumented bit in the RMP entry can
>> provide us some useful information hence we dump the raw values.
> You're saying that there are things that can cause RMP faults that
> aren't documented?  That's rather nasty for your users, don't you think?
> 

The "in-use" bit in the RMP entry caught me off guard. The AMD APM does 
says that hardware sets in-use bit but it *never* explained in the 
detail on how to check if the fault was due to in-use bit in the RMP 
table. As I said, the documentation folks will be updating the RMP entry 
to document the in-use bit. I hope we will not see any other 
undocumented surprises, I am keeping my finger cross :)


> I'd be fine if you want to define a mask of unknown bits and spit out to
> the users that some unknown bits are set.
>