linux-kernel - Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

Message-ID: <f2ab7c58-2d69-9042-4880-9b86bcdd4053@linux.intel.com>
Date:   Mon, 16 May 2022 10:39:45 -0700
From:   Sathyanarayanan Kuppuswamy 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Kai Huang <kai.huang@...el.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H . Peter Anvin" <hpa@...or.com>, Tony Luck <tony.luck@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Wander Lairson Costa <wander@...hat.com>,
        Isaku Yamahata <isaku.yamahata@...il.com>,
        marcelo.cerri@...onical.com, tim.gardner@...onical.com,
        khalid.elmously@...onical.com, philip.cox@...onical.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

Hi Dave,

On 5/10/22 3:42 AM, Kai Huang wrote:
> On Tue, 2022-05-10 at 11:54 +1200, Kai Huang wrote:
>> On Mon, 2022-05-09 at 15:09 +0300, Kirill A. Shutemov wrote:
>>> On Mon, May 09, 2022 at 03:37:22PM +1200, Kai Huang wrote:
>>>> On Sat, 2022-05-07 at 03:42 +0300, Kirill A. Shutemov wrote:
>>>>> On Fri, May 06, 2022 at 12:11:03PM +1200, Kai Huang wrote:
>>>>>> Kirill, what's your opinion?
>>>>>
>>>>> I said before that I think DMA API is the right tool here.
>>>>>
>>>>> Speculation about future of DMA in TDX is irrelevant here. If semantics
>>>>> change we will need to re-evaluate all users. VirtIO uses DMA API and it
>>>>> is conceptually the same use-case: communicate with the host.
>>>>
>>>> Virtio is designed for device driver to use, so it's fine to use DMA API. And
>>>> real DMA can happen to the virtio DMA buffers.  Attestation doesn't have such
>>>> assumption.
>>>
>>> Whether the attestation driver uses struct device is an implementation detail.
>>> I don't see what your point is.
>>
>> No real DMA is involved in attestation.
>>
>>>
>>>> So I don't see why TD guest kernel cannot have a simple protocol to vmap() a
>>>> page (or couple of pages) as shared on-demand, like below:
>>>>
>>>> 	page = alloc_page();
>>>>
>>>> 	addr = vmap(page,  pgprot_decrypted(PAGE_KERNEL));
>>>>
>>>> 	clflush_cache_range(page_address(page), PAGE_SIZE);
>>>>
>>>> 	MapGPA(page_to_phys(page) | cc_mkdec(0), PAGE_SIZE);
>>>>
>>>> And we can even avoid above clflush_cache_range() if I understand correctly.
>>>>
>>>> Or  I missed something?
>>>
>>> For completeness, cover free path too. Are you going to opencode page
>>> accept too?
>>
>> Call __tdx_module_call(TDX_ACCEPT_PAGE, ...) right after MapGPA() to convert
>> back to private.  I don't think there is any problem?
>>
>>>
>>> Private->Shared conversion is destructive. You have to split SEPT, flush
>>> TLB. Backward conversion even more costly.
>>
>> I think I won't call it destructive.
>>
>> And I suggested before, we can allocate a default size buffer (i.e. 4 pages),
>> which is large enough to cover all requests for now, during driver
>> initialization.  This avoids IOCTL time conversion.  We should still have code
>> in the IOCTL to check the request buffer size and when it is larger than the
>> default, the old one should be freed and a larger one should be allocated.  But for now
>> this code path will never happen.
>>
>> Btw above is based on assumption that we don't support concurrent IOCTLs.  This
>> version Sathya somehow changed to support concurrent IOCTLs but this was a
>> surprise as I thought we somehow agreed we don't need to support this.
> 
> Hi Dave,
> 
> Sorry I forgot to mention that GHCI 1.5 defines a generic TDVMCALL<Service> for
> a TD to communicate with VMM or another TD or some service in the host.  This
> TDVMCALL can support many sub-commands.  For now only sub-commands for TD
> migration are defined, but we can have more.
> 
> For this, we cannot assume the size of the command buffer, and I don't see why
> we don't want to support concurrent TDVMCALLs.  So looks from long term, we will
> very likely need IOCTL time buffer private-shared conversion.
> 
> 


Let me summarize the discussion so far.

Problem: Allocate a shared buffer without breaking the direct map.

Solution 1: Use alloc_pages*()/vmap()/set_memory_*crypted() APIs

Pros/Cons:

1. Uses a vmap()'ed virtual address for the shared/private conversion
    and hence does not touch the direct mapping.

2. The current set_memory_*crypted() APIs also modify the aliased
    mappings, which include the direct mapping. So if we want to use the
    set_memory_*() APIs, we need a new variant that does not touch the
    direct mapping. An alternative is to open code the page attribute
    conversion, cache flushing and MapGPA/page acceptance logic in the
    attestation driver itself. But, IMO, this is not preferred because
    it sprinkles the mapping conversion code across multiple places in
    the kernel. It is better to use a single API if possible.

3. This solution can split SEPT entries on private->shared
    conversion, and the backward conversion is also costly. IMO, since
    attestation requests are not very frequent, we don't need to be
    overly concerned about the cost of the conversion.

Solution 2: Use DMA alloc APIs.

Pros/Cons:

1. Simpler to use. It taps into the SWIOTLB buffers and does not
    affect the direct map. Since we will be using already converted
    memory, allocation/freeing will be cheaper compared to solution 1.

2. There is a concern that it is not a long-term solution: with the
    advent of TDX IO, not all DMA allocations will need to use the
    SWIOTLB model. But as per Kirill's comments, this is not a concern,
    and the force_dma_unencrypted() hook can be used to differentiate
    which devices use TDX IO and which use the SWIOTLB model.

3. Using the DMA APIs requires a valid bus device as an argument and
    hence requires converting this driver into a platform device
    driver. But since this driver does not do real DMA, making these
    changes just to use the DMA API is not recommended.
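
To make the cost argument between the two solutions concrete, here is a
toy userspace model. It is not kernel code and is only a sketch: it
counts page conversions and nothing else. POOL_PAGES, REQ_PAGES and both
function names are assumptions made up for this illustration, and the
real costs (SEPT splitting, TLB flushes, page acceptance) are not
modeled.

```c
/*
 * Toy userspace model (not kernel code): counts page conversions only.
 * POOL_PAGES, REQ_PAGES and the function names are illustrative
 * assumptions, not values or symbols from the patch series.
 */
#include <assert.h>
#include <stddef.h>

#define POOL_PAGES 64	/* assumed size of the pre-converted shared pool */
#define REQ_PAGES   4	/* assumed number of pages per Quote request */

/* Solution 1 model: convert on allocation, convert back on free. */
static unsigned long per_request_conversions(unsigned long requests)
{
	unsigned long conversions = 0;
	unsigned long i;

	for (i = 0; i < requests; i++) {
		conversions += REQ_PAGES;	/* private -> shared */
		conversions += REQ_PAGES;	/* shared -> private */
	}
	return conversions;
}

/* Solution 2 model: the pool is converted once; requests only borrow pages. */
static unsigned long pooled_conversions(unsigned long requests)
{
	unsigned long conversions = POOL_PAGES;	/* one-time setup cost */
	size_t free_pages = POOL_PAGES;
	unsigned long i;

	for (i = 0; i < requests; i++) {
		assert(free_pages >= REQ_PAGES);
		free_pages -= REQ_PAGES;	/* allocate from the pool */
		free_pages += REQ_PAGES;	/* free back to the pool */
	}
	return conversions;
}
```

Under these assumptions, 100 requests cost 800 page conversions in the
per-request model and 64 in the pooled model. The pooled cost is paid up
front regardless of how many requests arrive, which is why the gap only
matters if requests are frequent.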

Both solutions fix the problem, and each has its pros and cons. The
conclusion from both Kai's and Kirill's comments is that there is no
hard preference and that the decision is left to you.

Since you have already made a comment about "irrespective of which
model is chosen, you need the commit log talk about the solution and
how it not touches the direct map", I have posted the v6 version
adopting Solution 1.

Please let me know if you agree with this direction or have comments
about the solution.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer