linux-kernel - Re: [PATCH v8 5/5] x86/tdx: Add Quote generation support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d3808510-9974-258e-0c7b-9a76e0868d48@intel.com>
Date:   Thu, 21 Jul 2022 09:08:11 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc:     "H . Peter Anvin" <hpa@...or.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Tony Luck <tony.luck@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kai Huang <kai.huang@...el.com>,
        Wander Lairson Costa <wander@...hat.com>,
        Isaku Yamahata <isaku.yamahata@...il.com>,
        marcelo.cerri@...onical.com, tim.gardner@...onical.com,
        khalid.elmously@...onical.com, philip.cox@...onical.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 5/5] x86/tdx: Add Quote generation support

On 6/8/22 19:52, Kuppuswamy Sathyanarayanan wrote:
> For shared buffer allocation, alternatives like using the DMA API is
> also considered. Although it simpler to use, it is not preferred because
> dma_alloc_*() APIs require a valid bus device as argument, which would
> need converting the attestation driver into a platform device driver.
> This is unnecessary, and since the attestation driver does not do real
> DMA, there is no need to use real DMA APIs.

Let's actually try to walk through the requirements for the memory
allocation here.

 1. The guest kernel needs to allocate some guest physical memory
    for the attestation data buffer
 2. The guest physical memory must be mapped by the guest so that
    it can be read/written.
 3. The guest mapping must be a "TDX Shared" mapping.  Since all
    guest physical memory is "TDX Private" by default, something
    must convert the memory from Private->Shared.
 4. If there are alias mappings with "TDX Private" page table
    permissions, those mappings must never be used while the page is
    in its shared state.
    4a. load_unaligned_zeropad() must be prevented from being used
	on the page immediately preceding a Private alias to a Shared
 	page.
 5. Actions that increasingly fracture the direct map must be avoided.
    Attestation may happen many times and repeated allocations that
    fracture the direct map have performance consequences.
 6. A softer requirement: presuming that bounce buffers won't be used
    for TDX devices *forever*, it would be nice to use a mechanism that
    will continue to work on systems that don't have swiotlb on.

I think we've talked about three different solutions:

== vmalloc() ==

So, let's say we used a relatively plain vmalloc().  That's great for
#1->#3 as long as the vmalloc() mapping gets the "TDX Shared" bit set
properly on its PTEs.  But, it falls over for *either* #4 or #5.  If it
leaves the direct map alone, it's exposed to load_unaligned_zeropad().
If it unmaps the memory from the direct map, it runs afoul of #5.

== order-1 + vmap() ==

Let's now consider a vmalloc() variant: allocate a bunch of order-1
pages and vmap() page[1], leaving page[0] as a guard page against
load_unaligned_zeropad() on the direct map.  That works, but it's an
annoying amount of code.

== swiotlb pages ==

Using the swiotlb bounce buffer pages is the other proposed option.
They already have a working kernel mapping and have already been
converted.  They are mitigated against load_unaligned_zeropad().  They
do cause direct map fracturing, but only once since they're allocated
statically.  They don't increasingly degrade things.  It's a one-time
cost.  Their interaction with #6 is not great.

Did I miss anything?  Does that accurately capture where we are?