[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11937ca5-0dc2-d3a0-329b-236345d9f904@amd.com>
Date: Mon, 22 Nov 2021 12:01:36 -0600
From: Brijesh Singh <brijesh.singh@....com>
To: Vlastimil Babka <vbabka@...e.cz>, Peter Gonda <pgonda@...gle.com>
Cc: brijesh.singh@....com, x86@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
Tom Lendacky <Thomas.Lendacky@....com>,
"H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Sergio Lopez <slp@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dov Murik <dovmurik@...ux.ibm.com>,
Tobin Feldman-Fitzthum <tobin@....com>,
Borislav Petkov <bp@...en8.de>,
Michael Roth <michael.roth@....com>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Andi Kleen <ak@...ux.intel.com>, tony.luck@...el.com,
marcorr@...gle.com, sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP)
Hypervisor Support
On 11/22/21 11:03 AM, Vlastimil Babka wrote:
> On 11/22/21 16:23, Brijesh Singh wrote:
>> Hi Peter,
>>
>> On 11/12/21 9:43 AM, Peter Gonda wrote:
>>> Hi Brijesh,,
>>>
>>> One high level discussion I'd like to have on these SNP KVM patches.
>>>
>>> In these patches (V5) if a host userspace process writes a guest
>>> private page a SIGBUS is issued to that process. If the kernel writes
>>> a guest private page then the kernel panics due to the unhandled RMP
>>> fault page fault. This is an issue because not all writes into guest
>>> memory may come from a bug in the host. For instance a malicious or
>>> even buggy guest could easily point the host to writing a private page
>>> during the emulation of many virtual devices (virtio, NVMe, etc). For
>>> example if a well behaved guests behavior is to: start up a driver,
>>> select some pages to share with the guest, ask the host to convert
>>> them to shared, then use those pages for virtual device DMA, if a
>>> buggy guest forget the step to request the pages be converted to
>>> shared its easy to see how the host could rightfully write to private
>>> memory. I think we can better guarantee host reliability when running
>>> SNP guests without changing SNP’s security properties.
>>>
>>> Here is an alternative to the current approach: On RMP violation (host
>>> or userspace) the page fault handler converts the page from private to
>>> shared to allow the write to continue. This pulls from s390’s error
>>> handling which does exactly this. See ‘arch_make_page_accessible()’.
>>> Additionally it adds less complexity to the SNP kernel patches, and
>>> requires no new ABI.
>>>
>>> In the current (V5) KVM implementation if a userspace process
>>> generates an RMP violation (writes to guest private memory) the
>>> process receives a SIGBUS. At first glance, it would appear that
>>> user-space shouldn’t write to private memory. However, guaranteeing
>>> this in a generic fashion requires locking the RMP entries (via locks
>>> external to the RMP). Otherwise, a user-space process emulating a
>>> guest device IO may be vulnerable to having the guest memory
>>> (maliciously or by guest bug) converted to private while user-space
>>> emulation is happening. This results in a well behaved userspace
>>> process receiving a SIGBUS.
>>>
>>> This proposal allows buggy and malicious guests to run under SNP
>>> without jeopardizing the reliability / safety of host processes. This
>>> is very important to a cloud service provider (CSP) since it’s common
>>> to have host wide daemons that write/read all guests, i.e. a single
>>> process could manage the networking for all VMs on the host. Crashing
>>> that singleton process kills networking for all VMs on the system.
>>>
>> Thank you for starting the thread; based on the discussion, I am keeping the
>> current implementation as-is and *not* going with the auto conversion from
>> private to shared. To summarize what we are doing in the current SNP series:
>>
>> - If userspace accesses guest private memory, it gets SIGBUS.
>
> So, is there anything protecting host userspace processes from malicious guests?
>
Unfortunately, no.
In the future, we could look into Sean's suggestion to come with an ABI
that userspace can use to lock the guest pages before the access and
notify the caller of the access violation. It seems that TDX may need
something similar, but I cannot tell for sure. This proposal seems good
at the first glance but devil is in the detail; once implemented we also
need to measure the performance implication of it.
Should we consider using SIGSEGV (SEGV_ACCERR) instead of SIGBUS? In
other words, treating a guest's private pages as read-only and writing
to them will generate a standard SIGSEGV.
thanks
>> - If kernel accesses[*] guest private memory, it does panic.
>>
>> [*] Kernel consults the RMP table for the page ownership before the access.
>> If the page is shared, then it uses the locking mechanism to ensure that a
>> guest will not be able to change the page ownership while kernel has it mapped.
>>
>> thanks
>>
Powered by blists - more mailing lists