[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2b4accb6-b68e-02d3-6fed-975f90558099@amd.com>
Date: Mon, 12 Jul 2021 10:43:50 -0500
From: Brijesh Singh <brijesh.singh@....com>
To: Dave Hansen <dave.hansen@...el.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org
Cc: brijesh.singh@....com, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
Tom Lendacky <thomas.lendacky@....com>,
"H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dov Murik <dovmurik@...ux.ibm.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Borislav Petkov <bp@...en8.de>,
Michael Roth <michael.roth@....com>,
Vlastimil Babka <vbabka@...e.cz>, tony.luck@...el.com,
npmccallum@...hat.com, brijesh.ksingh@...il.com
Subject: Re: [PATCH Part2 RFC v4 10/40] x86/fault: Add support to handle the
RMP fault for user address
Hi Dave,
On 7/8/21 11:16 AM, Dave Hansen wrote:
>
> "SIGBUG"?
Its typo, it should be SIGBUS
>> +
>> + if (unlikely(!cpu_feature_enabled(X86_FEATURE_SEV_SNP)))
>> + return RMP_FAULT_KILL;
>
> Shouldn't this be a WARN_ON_ONCE()? How can we get RMP faults without
> SEV-SNP?
Yes, we should *not* get RMP fault if SEV-SNP is not enabled. I can use
the WARN_ON_ONCE().
>
>> + /* Get the native page level */
>> + pte = lookup_address_in_mm(current->mm, address, &level);
>> + if (unlikely(!pte))
>> + return RMP_FAULT_KILL;
>
> What would this mean? There was an RMP fault on a non-present page?
> How could that happen? What if there was a race between an unmapping
> event and the RMP fault delivery?
We should not have RMP fault for non-present pages. But you have a good
point that there maybe a race between the unmap event and RMP fault.
Instead of terminating the process we should simply retry.
>
>> + pfn = pte_pfn(*pte);
>> + if (level > PG_LEVEL_4K) {
>> + mask = pages_per_hpage(level) - pages_per_hpage(level - 1);
>> + pfn |= (address >> PAGE_SHIFT) & mask;
>> + }
>
> This looks inherently racy. What happens if there are two parallel RMP
> faults on the same 2M page. One of them splits the page tables, the
> other gets a fault for an already-split page table.
> > Is that handled here somehow?
Yes, in this particular case we simply retry and hardware should
re-evaluate the page level and take the corrective action.
>
>> + /* Get the page level from the RMP entry. */
>> + e = snp_lookup_page_in_rmptable(pfn_to_page(pfn), &rmp_level);
>> + if (!e)
>> + return RMP_FAULT_KILL;
>
> The snp_lookup_page_in_rmptable() failure cases looks WARN-worthly.
> Either you're doing a lookup for something not *IN* the RMP table, or
> you don't support SEV-SNP, in which case you shouldn't be in this code
> in the first place.
Noted.
>
>> + /*
>> + * Check if the RMP violation is due to the guest private page access.
>> + * We can not resolve this RMP fault, ask to kill the guest.
>> + */
>> + if (rmpentry_assigned(e))
>> + return RMP_FAULT_KILL;
>
> No "We's", please. Speak in imperative voice.
Noted.
>
>> + /*
>> + * The backing page level is higher than the RMP page level, request
>> + * to split the page.
>> + */
>> + if (level > rmp_level)
>> + return RMP_FAULT_PAGE_SPLIT;
>
> This can theoretically trigger on a hugetlbfs page. Right?
>
Yes, theoretically.
In the current implementation, the VMM is enlightened to not use the
hugetlbfs for backing page when creating the SEV-SNP guests.
> I thought I asked about this before... more below...
>
>> + return RMP_FAULT_RETRY;
>> +}
>> +
>> /*
>> * Handle faults in the user portion of the address space. Nothing in here
>> * should check X86_PF_USER without a specific justification: for almost
>> @@ -1298,6 +1350,7 @@ void do_user_addr_fault(struct pt_regs *regs,
>> struct task_struct *tsk;
>> struct mm_struct *mm;
>> vm_fault_t fault;
>> + int ret;
>> unsigned int flags = FAULT_FLAG_DEFAULT;
>>
>> tsk = current;
>> @@ -1378,6 +1431,22 @@ void
> (struct pt_regs *regs,
>> if (error_code & X86_PF_INSTR)
>> flags |= FAULT_FLAG_INSTRUCTION;
>>
>> + /*
>> + * If its an RMP violation, try resolving it.
>> + */
>> + if (error_code & X86_PF_RMP) {
>> + ret = handle_user_rmp_page_fault(error_code, address);
>> + if (ret == RMP_FAULT_PAGE_SPLIT) {
>> + flags |= FAULT_FLAG_PAGE_SPLIT;
>> + } else if (ret == RMP_FAULT_KILL) {
>> + fault |= VM_FAULT_SIGBUS;
>> + do_sigbus(regs, error_code, address, fault);
>> + return;
>> + } else {
>> + return;
>> + }
>> + }
>
> Why not just have handle_user_rmp_page_fault() return a VM_FAULT_* code
> directly?
>
I don't have any strong reason against it. In next rev, I can update to
use the VM_FAULT_* code and call the do_sigbus() etc.
> I also suspect you can just set VM_FAULT_SIGBUS and let the do_sigbus()
> call later on in the function do its work.
>>
>> +static int handle_split_page_fault(struct vm_fault *vmf)
>> +{
>> + if (!IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
>> + return VM_FAULT_SIGBUS;
>> +
>> + __split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
>> + return 0;
>> +}
>
> What will this do when you hand it a hugetlbfs page?
>
VMM is updated to not use the hugetlbfs when creating SEV-SNP guests.
So, we should not run into it.
-Brijesh
Powered by blists - more mailing lists