Message-ID: <f2ab7c58-2d69-9042-4880-9b86bcdd4053@linux.intel.com>
Date: Mon, 16 May 2022 10:39:45 -0700
From: Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Kai Huang <kai.huang@...el.com>,
"Kirill A. Shutemov" <kirill@...temov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Dave Hansen <dave.hansen@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>, Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Wander Lairson Costa <wander@...hat.com>,
Isaku Yamahata <isaku.yamahata@...il.com>,
marcelo.cerri@...onical.com, tim.gardner@...onical.com,
khalid.elmously@...onical.com, philip.cox@...onical.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support
Hi Dave,
On 5/10/22 3:42 AM, Kai Huang wrote:
> On Tue, 2022-05-10 at 11:54 +1200, Kai Huang wrote:
>> On Mon, 2022-05-09 at 15:09 +0300, Kirill A. Shutemov wrote:
>>> On Mon, May 09, 2022 at 03:37:22PM +1200, Kai Huang wrote:
>>>> On Sat, 2022-05-07 at 03:42 +0300, Kirill A. Shutemov wrote:
>>>>> On Fri, May 06, 2022 at 12:11:03PM +1200, Kai Huang wrote:
>>>>>> Kirill, what's your opinion?
>>>>>
>>>>> I said before that I think DMA API is the right tool here.
>>>>>
>>>>> Speculation about future of DMA in TDX is irrelevant here. If semantics
>>>>> change we will need to re-evaluate all users. VirtIO uses DMA API and it
>>>>> is conceptually the same use-case: communicate with the host.
>>>>
>>>> Virtio is designed for device driver to use, so it's fine to use DMA API. And
>>>> real DMA can happen to the virtio DMA buffers. Attestation doesn't have such
>>>> assumption.
>>>
>>> Whether attestation driver uses struct device is implementation detail.
>>> I don't see what is your point.
>>
>> No real DMA is involved in attestation.
>>
>>>
>>>> So I don't see why TD guest kernel cannot have a simple protocol to vmap() a
>>>> page (or couple of pages) as shared on-demand, like below:
>>>>
>>>> page = alloc_page();
>>>>
>>>> addr = vmap(page, pgprot_decrypted(PAGE_KERNEL));
>>>>
>>>> clflush_cache_range(page_address(page), PAGE_SIZE);
>>>>
>>>> MapGPA(page_to_phys(page) | cc_mkdec(0), PAGE_SIZE);
>>>>
>>>> And we can even avoid above clflush_cache_range() if I understand correctly.
>>>>
>>>> Or I missed something?
>>>
>>> For completeness, cover free path too. Are you going to opencode page
>>> accept too?
>>
>> Call __tdx_module_call(TDX_ACCEPT_PAGE, ...) right after MapGPA() to convert
>> back to private. I don't think there is any problem?
>>
>>>
>>> Private->Shared conversion is destructive. You have to split SEPT, flush
>>> TLB. Backward conversion even more costly.
>>
>> I think I won't call it destructive.
>>
>> And I suggested before, we can allocate a default size buffer (i.e. 4 pages),
>> which is large enough to cover all requests for now, during driver
>> initialization. This avoids IOCTL time conversion. We should still have code
>> in the IOCTL to check the request buffer size and when it is larger than the
>> default, the old should be freed and a larger one should be allocated. But for now
>> this code path will never happen.
>>
>> Btw above is based on assumption that we don't support concurrent IOCTLs. This
>> version Sathya somehow changed to support concurrent IOCTLs but this was a
>> surprise as I thought we somehow agreed we don't need to support this.
>
> Hi Dave,
>
> Sorry I forgot to mention that GHCI 1.5 defines a generic TDVMCALL<Service> for
> a TD to communicate with VMM or another TD or some service in the host. This
> TDVMCALL can support many sub-commands. For now only sub-commands for TD
> migration are defined, but we can have more.
>
> For this, we cannot assume the size of the command buffer, and I don't see why
> we don't want to support concurrent TDVMCALLs. So in the long term, it looks like we will
> very likely need IOCTL time buffer private-shared conversion.
>
>
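The pre-allocate-and-grow policy Kai describes above (a default 4-page
buffer set up at driver init, replaced only when a request is larger)
can be sketched roughly as below. This is only a userspace model under
stated assumptions, not driver code: calloc()/free() stand in for
"allocate and convert to shared" and "convert back to private and
free", the names quote_buf, model_alloc_shared() and
model_free_shared() are hypothetical, and it assumes no concurrent
IOCTLs. The conversions counter just makes the number of
private<->shared conversions visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096UL  /* assumed page size for the model */
#define DEFAULT_PAGES   4UL     /* default buffer size suggested in the thread */

/* Hypothetical model of the driver's shared buffer (single user only). */
struct quote_buf {
	void *va;                   /* current buffer, NULL if none */
	size_t npages;              /* size of the current buffer in pages */
	unsigned long conversions;  /* private<->shared conversions so far */
};

/* Stand-in for "allocate pages and convert them to shared". */
static void *model_alloc_shared(struct quote_buf *qb, size_t npages)
{
	void *va = calloc(npages, MODEL_PAGE_SIZE);

	if (va)
		qb->conversions++;      /* private -> shared */
	return va;
}

/* Stand-in for "convert pages back to private and free them". */
static void model_free_shared(struct quote_buf *qb, void *va)
{
	qb->conversions++;          /* shared -> private */
	free(va);
}

/* Driver-init time: set up the default buffer. Returns 0 on success. */
static int quote_buf_init(struct quote_buf *qb)
{
	assert(qb != NULL);
	qb->conversions = 0;
	qb->va = model_alloc_shared(qb, DEFAULT_PAGES);
	qb->npages = qb->va ? DEFAULT_PAGES : 0;
	return qb->va ? 0 : -1;
}

/*
 * IOCTL time: return a buffer that can hold len bytes. The existing
 * buffer is reused when it is large enough; otherwise a larger one is
 * allocated and the old one is released. The new buffer is allocated
 * first so that a failure leaves the old buffer usable.
 */
static void *quote_buf_get(struct quote_buf *qb, size_t len)
{
	size_t need = (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	void *va;

	assert(qb != NULL);
	if (need <= qb->npages)
		return qb->va;          /* common case: no conversion needed */

	va = model_alloc_shared(qb, need);
	if (!va)
		return NULL;
	if (qb->va)
		model_free_shared(qb, qb->va);
	qb->va = va;
	qb->npages = need;
	return va;
}
```

With this model, requests that fit in the default buffer cost no
conversion at IOCTL time; only an oversized request pays for one
allocation plus one release.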
Let me summarize the discussion so far.

Problem: Allocate a shared buffer without breaking the direct map.

Solution 1: Use alloc_pages*()/vmap()/set_memory_*crypted() APIs

Pros/Cons:

1. Uses the vmap()'d virtual address for the shared/private conversion
   and hence does not touch the direct mapping.

2. The current set_memory_*crypted() APIs modify all aliased
   mappings, which includes the direct mapping. So if we want to use
   the set_memory_*crypted() APIs, we need a new variant that does not
   touch the direct mapping. An alternative is to open code the page
   attribute conversion, cache flushing and MapGPA/page acceptance
   logic in the attestation driver itself. But, IMO, this is not
   preferred because it sprinkles the mapping conversion code in
   multiple places in the kernel. It is better to use a single API if
   possible.

3. This solution can split SEPT entries on private->shared
   conversion, and the backward conversion is also costly. IMO, since
   attestation requests are not very frequent, we don't need to be
   overly concerned about the cost of the conversion.

Solution 2: Use DMA alloc APIs.

Pros/Cons:

1. Simpler to use. It taps into the SWIOTLB buffers and does not
   affect the direct map. Since we will be using already converted
   memory, allocation/freeing will be cheaper compared to Solution 1.

2. There is a concern that it is not a long term solution: with the
   advent of TDX IO, not all DMA allocations need to use the SWIOTLB
   model. But as per Kirill's comments, this is not a concern, and the
   force_dma_unencrypted() hook can be used to differentiate which
   devices need to use TDX IO vs the SWIOTLB model.

3. Using DMA APIs requires a valid bus device as an argument and hence
   requires this driver to be converted into a platform device driver.
   But since this driver does not do real DMA, making the above
   changes just to use the DMA API is not recommended.

Since both solutions fix the problem (each with its own pros/cons), the
conclusion from both Kai's and Kirill's comments is that there is no
hard preference, and they would like to let you decide.

Since you have already commented that "irrespective of which
model is chosen, you need the commit log talk about the solution and
how it not touches the direct map", I have posted the v6 version
adopting Solution 1.

Please let me know if you agree with this direction or have comments
about the solution.
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer