linux-kernel - Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <061ccd49-3b9f-d603-bafd-61a067c3f6fa@intel.com>
Date:   Fri, 12 Nov 2021 09:59:46 -0800
From:   Dave Hansen <dave.hansen@...el.com>
To:     Peter Gonda <pgonda@...gle.com>,
        Brijesh Singh <brijesh.singh@....com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        Tom Lendacky <Thomas.Lendacky@....com>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Michael Roth <michael.roth@....com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Andi Kleen <ak@...ux.intel.com>, tony.luck@...el.com,
        marcorr@...gle.com, sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP)
 Hypervisor Support

On 11/12/21 7:43 AM, Peter Gonda wrote:
...
> Here is an alternative to the current approach: On RMP violation (host
> or userspace) the page fault handler converts the page from private to
> shared to allow the write to continue. This pulls from s390’s error
> handling which does exactly this. See ‘arch_make_page_accessible()’.
> Additionally it adds less complexity to the SNP kernel patches, and
> requires no new ABI.

I think it's important to very carefully describe where these RMP page
faults can occur within the kernel.  They can obvious occur on things like:

	copy_to_user(&user_buf, &kernel_buf, len);

That's not a big deal.  Those can obviously handle page faults.  We know
exactly the instruction on which the fault can occur and we handle it
gracefully.

*But*, these are harder:

	get_user_pages(addr, len, &pages);
	spin_lock(&lock);
	ptr = kmap_atomic(pages[0]);
	*ptr = foo; // #PF here
	kunmap_atomic(ptr);
	spin_unlock(&lock);
	put_page(pages[0]);

In this case, the place where the fault happens are not especially well
bounded.  It can be in compiler-generated code.  It can happen on any
random instruction in there.

Or, is there some mechanism that prevent guest-private memory from being
accessed in random host kernel code?

> This proposal does cause guest memory corruption for some bugs but one
> of SEV-SNP’s goals extended from SEV-ES’s goals is for guest’s to be
> able to detect when its memory has been corrupted / replayed by the
> host. So SNP already has features for allowing guests to detect this
> kind of memory corruption. Additionally this is very similar to a page
> of memory generating a machine check because of 2-bit memory
> corruption. In other words SNP guests must be enlightened and ready
> for these kinds of errors.
> 
> For an SNP guest running under this proposal the flow would look like this:
> * Host gets a #PF because its trying to write to a private page.
> * Host #PF handler updates the page to shared.
> * Write continues normally.
> * Guest accesses memory (r/w).
> * Guest gets a #VC error because the page is not PVALIDATED
> * Guest is now in control. Guest can terminate because its memory has
> been corrupted. Guest could try and continue to log the error to its
> owner.

This sounds like a _possible_ opportunity for the guest to do some extra
handling.  It's also quite possible that this #VC happens in a place
that the guest can't handle.


> A similar approach was introduced in the SNP patches V1 and V2 for
> kernel page fault handling. The pushback around this convert to shared
> approach was largely focused around the idea that the kernel has all
> the information about which pages are shared vs private so it should
> be able to check shared status before write to pages. After V2 the
> patches were updated to not have a kernel page fault handler for RMP
> violations (other than dumping state during a panic). The current
> patches protect the host with new post_{map,unmap}_gfn() function that
> checks if a page is shared before mapping it, then locks the page
> shared until unmapped. Given the discussions on ‘[Part2,v5,39/45] KVM:
> SVM: Introduce ops for the post gfn map and unmap’ building a solution
> to do this is non trivial and adds new overheads to KVM. Additionally
> the current solution is local to the kernel. So a new ABI just now be
> created to allow the userspace VMM to access the kernel-side locks for
> this to work generically for the whole host. This is more complicated
> than this proposal and adding more lock holders seems like it could
> reduce performance further.

The locking is complicated.  But, I think it is necessary.  Once the
kernel can map private memory, it can access it in random spots.  It's
hard to make random kernel code recoverable from faults.