[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c07c9fa1-ff6a-df8a-6050-60ade29367f3@linux.intel.com>
Date: Fri, 13 May 2022 12:29:55 -0700
From: Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Isaku Yamahata <isaku.yamahata@...il.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Kai Huang <kai.huang@...el.com>,
Wander Lairson Costa <wander@...hat.com>,
marcelo.cerri@...onical.com, tim.gardner@...onical.com,
khalid.elmously@...onical.com, philip.cox@...onical.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 5/5] x86/tdx: Add Quote generation support
On 5/13/22 11:58 AM, Isaku Yamahata wrote:
> On Thu, May 12, 2022 at 03:19:52PM -0700,
> Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
>
>> In TDX guest, the second stage in attestation process is to send the
>> TDREPORT to QE/QGS to generate the TD Quote. For platforms that does
>> not support communication channels like vsock or TCP/IP, implement
>> support to get TD Quote using hypercall. GetQuote hypercall can be used
>> by the TD guest to request VMM facilitate the Quote generation via
>> QE/QGS. More details about GetQuote hypercall can be found in TDX
>> Guest-Host Communication Interface (GHCI) for Intel TDX 1.0, section
>> titled "TDG.VP.VMCALL<GetQuote>.
>>
>> Since GetQuote is an asynchronous request hypercall, it will not block
>> till the TD Quote is generated. So VMM uses callback interrupt vector
>> configured by SetupEventNotifyInterrupt hypercall to notify the guest
>> about Quote generation completion or failure.
>>
>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
>> with TDREPORT data as input, which is further used by the VMM to copy
>> the TD Quote result after successful Quote generation. To create the
>> shared buffer without breaking the direct map, allocate physically
>> contiguous kernel memory and create a virtual mapping for it using
>> vmap(). set_memory_*crypted_noalias() functions can be used to share or
>> unshare the vmapped page without affecting the direct map.
>>
>> Also note that, shared buffer allocation is currently handled in IOCTL
>> handler, although it will increase the TDX_CMD_GET_QUOTE IOCTL response
>> time, it is negligible compared to the time required for the quote
>> generation completion. So IOCTL performance optimization is not
>> considered at this time.
>>
>> For shared buffer allocation, alternatives like using the DMA API is
>> also considered. Although it simpler to use, it is not preferred because
>> dma_alloc_*() APIs require a valid bus device as argument, which would
>> need converting the attestation driver into a platform device driver.
>> This is unnecessary, and since the attestation driver does not do real
>> DMA, there is no need to use real DMA APIs.
>>
>> Add support for TDX_CMD_GET_QUOTE IOCTL to allow attestation agent
>> submit GetQuote requests from the user space. Since Quote generation
>> is an asynchronous request, IOCTL will block indefinitely for the VMM
>> response in wait_for_completion_interruptible() call. Using this call
>> will also add an option for the user to end the current request
>> prematurely by raising any signals. This can be used by attestation
>> agent to implement Quote generation timeout feature. If attestation
>> agent is aware of time it can validly wait for QE/QGS response, then
>> a possible timeout support can be implemented in the user application
>> using signals. Quote generation timeout feature is currently not
>> implemented in the driver because the current TDX specification does
>> not have any recommendation for it.
>>
>> After submitting the GetQuote request using hypercall, the shared buffer
>> allocated for the current request is owned by the VMM. So, during this
>> wait window, if the user terminates the request by raising a signal or
>> by terminating the application, add a logic to do the memory cleanup
>> after receiving the VMM response at a later time. Such memory cleanup
>> support requires accepting the page again using TDX_ACCEPT_PAGE TDX
>> Module call. So to not overload the callback IRQ handler, move the
>> callback handler logic to a separate work queue.
>>
>> To support parallel GetQuote requests, use linked list to track the
>> active GetQuote requests and upon receiving the callback IRQ, loop
>> through the active requests and mark the processed requests complete.
>> Users can open multiple instances of the attestation device and send
>> GetQuote requests in parallel.
>>
>> Reviewed-by: Tony Luck <tony.luck@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>> arch/x86/coco/tdx/attest.c | 293 ++++++++++++++++++++++++++++++++
>> arch/x86/include/uapi/asm/tdx.h | 45 +++++
>> 2 files changed, 338 insertions(+)
>>
>> diff --git a/arch/x86/coco/tdx/attest.c b/arch/x86/coco/tdx/attest.c
>> index a5f4111f9b18..5531a1834f8c 100644
>> --- a/arch/x86/coco/tdx/attest.c
>> +++ b/arch/x86/coco/tdx/attest.c
>> @@ -13,16 +13,51 @@
>> #include <linux/miscdevice.h>
>> #include <linux/mm.h>
>> #include <linux/io.h>
>> +#include <linux/set_memory.h>
>> +#include <linux/mutex.h>
>> #include <asm/tdx.h>
>> +#include <asm/coco.h>
>> #include <uapi/asm/tdx.h>
>>
>> #define DRIVER_NAME "tdx-attest"
>>
>> /* TDREPORT module call leaf ID */
>> #define TDX_GET_REPORT 4
>> +/* GetQuote hypercall leaf ID */
>> +#define TDVMCALL_GET_QUOTE 0x10002
>> +
>> +/* Used for buffer allocation in GetQuote request */
>> +struct quote_buf {
>> + void *vmaddr;
>> + int count;
>> +};
>> +
>> +/* List entry of quote_list*/
>> +struct quote_entry {
>> + bool valid;
>> + struct quote_buf *buf;
>> + struct list_head list;
>> + struct completion compl;
>> +};
>>
>> static struct miscdevice miscdev;
>>
>> +/*
>> + * To support parallel GetQuote requests, use the list
>> + * to track active GetQuote requests.
>> + */
>> +static LIST_HEAD(quote_list);
>> +
>> +/* Lock to protect quote_list */
>> +static DEFINE_MUTEX(quote_lock);
>> +
>> +/*
>> + * Workqueue to handle Quote data after Quote generation
>> + * notification from VMM.
>> + */
>> +struct workqueue_struct *quote_wq;
>> +struct work_struct quote_work;
>> +
>> static long tdx_get_report(void __user *argp)
>> {
>> void *reportdata = NULL, *tdreport = NULL;
>> @@ -71,6 +106,254 @@ static long tdx_get_report(void __user *argp)
>> return ret;
>> }
>>
>> +/* tdx_get_quote_hypercall() - Request to get TD Quote using TDREPORT */
>> +static long tdx_get_quote_hypercall(struct quote_buf *buf)
>> +{
>> + struct tdx_hypercall_args args = {0};
>> +
>> + args.r10 = TDX_HYPERCALL_STANDARD;
>> + args.r11 = TDVMCALL_GET_QUOTE;
>> + args.r12 = cc_mkdec(page_to_phys(vmalloc_to_page(buf->vmaddr)));
>> + args.r13 = buf->count * PAGE_SIZE;
>> +
>> + /*
>> + * Pass the physical address of TDREPORT to the VMM and
>> + * trigger the Quote generation. It is not a blocking
>> + * call, hence completion of this request will be notified to
>> + * the TD guest via a callback interrupt. More info about ABI
>> + * can be found in TDX Guest-Host-Communication Interface
>> + * (GHCI), sec titled "TDG.VP.VMCALL<GetQuote>".
>> + */
>> + return __tdx_hypercall(&args, 0);
>> +}
>> +
>> +/*
>> + * alloc_quote_buf() - Used to allocate a shared buffer of
>> + * given size.
>> + *
>> + * Size is page aligned and the allocated memory is decrypted
>> + * to allow VMM access to it. Uses VMAP to create a shared mapping
>> + * for the buffer to not affect the direct map.
>> + */
>> +static struct quote_buf *alloc_quote_buf(u64 req_size)
>> +{
>> + int size = PAGE_ALIGN(req_size);
>> + void *addr = NULL, *vmaddr = NULL;
>> + int count = size >> PAGE_SHIFT;
>> + struct page **pages = NULL;
>> + struct quote_buf *buf;
>> + int i;
>> +
>> + buf = kmalloc(sizeof(*buf), GFP_KERNEL);
>> + if (!buf)
>> + return NULL;
>> +
>> + addr = alloc_pages_exact(size, GFP_KERNEL);
>> + if (!addr)
>> + goto alloc_failed;
>> +
>> + /* Allocate mem for array of page ptrs */
>> + pages = kcalloc(count, sizeof(*pages), GFP_KERNEL);
>> + if (!pages)
>> + goto alloc_failed;
>> +
>> + for (i = 0; i < count; i++)
>> + pages[i] = virt_to_page(addr + i * PAGE_SIZE);
>> +
>> + /*
>> + * Use VMAP to create shared mapping without affecting
>> + * the direct map. Use VM_MAP_PUT_PAGES to allow vmap()
>> + * responsible for freeing the pages when using vfree().
>> + */
>> + vmaddr = vmap(pages, count, VM_MAP_PUT_PAGES, PAGE_KERNEL);
>> + if (!vmaddr)
>> + goto alloc_failed;
>> +
>> + /* Use noalias variant to not affect the direct mapping */
>> + if (set_memory_decrypted_noalias((unsigned long)vmaddr, count))
>> + goto alloc_failed;
>> +
>> + buf->vmaddr = vmaddr;
>> + buf->count = count;
>> +
>> + return buf;
>> +
>> +alloc_failed:
>> + if (!vmaddr) {
>> + kfree(pages);
>> + if (addr)
>> + free_pages_exact(addr, size);
>> + }
>> + vfree(vmaddr);
>> + kfree(buf);
>> + return NULL;
>> +}
>> +
>> +/* Remove the shared mapping and free the buffer */
>> +static void free_quote_buf(struct quote_buf *buf)
>> +{
>> + if (!buf)
>> + return;
>> +
>> + /* Mark pages private */
>> + if (set_memory_encrypted_noalias((unsigned long)buf->vmaddr,
>> + buf->count)) {
>> + pr_warn("Failed to encrypt %d pages at %p", buf->count,
>> + buf->vmaddr);
>> + return;
>> + }
>> +
>> + vfree(buf->vmaddr);
>> + kfree(buf);
>> +}
>> +
>> +static struct quote_entry *alloc_quote_entry(u64 buf_len)
>> +{
>> + struct quote_entry *entry = NULL;
>> +
>> + entry = kmalloc(sizeof(*entry), GFP_KERNEL);
>> + if (!entry)
>> + return NULL;
>> +
>> + /* Allocate buffer for quote request */
>> + entry->buf = alloc_quote_buf(buf_len);
>> + if (!entry->buf) {
>> + kfree(entry);
>> + return NULL;
>> + }
>> +
>> + init_completion(&entry->compl);
>> + entry->valid = true;
>> +
>> + return entry;
>> +}
>> +
>> +static void free_quote_entry(struct quote_entry *entry)
>> +{
>> + free_quote_buf(entry->buf);
>> + kfree(entry);
>> +}
>> +
>> +static void add_quote_entry(struct quote_entry *entry)
>> +{
>> + mutex_lock("e_lock);
>> + list_add_tail(&entry->list, "e_list);
>> + mutex_unlock("e_lock);
>> +}
>> +
>> +static void del_quote_entry(struct quote_entry *entry)
>> +{
>> + mutex_lock("e_lock);
>> + list_del(&entry->list);
>> + mutex_unlock("e_lock);
>> +}
>> +
>> +static void invalidate_quote_request(struct quote_entry *entry)
>> +{
>> + mutex_lock("e_lock);
>> + entry->valid = false;
>> + mutex_unlock("e_lock);
>> +}
>> +
>> +static long tdx_get_quote(void __user *argp)
>> +{
>> + struct quote_entry *entry;
>> + struct tdx_quote_req req;
>> + struct quote_buf *buf;
>> + long ret;
>> +
>> + /* Copy GetQuote request struct from user buffer */
>> + if (copy_from_user(&req, argp, sizeof(struct tdx_quote_req)))
>> + return -EFAULT;
>> +
>> + /* Make sure the length is valid */
>> + if (!req.len)
>> + return -EINVAL;
>> +
>> + entry = alloc_quote_entry(req.len);
>> + if (!entry)
>> + return -ENOMEM;
>> +
>> + buf = entry->buf;
>> +
>> + /* Copy TDREPORT from user buffer to kernel Quote buffer */
>> + if (copy_from_user(buf->vmaddr, (void __user *)req.buf, req.len)) {
>> + free_quote_entry(entry);
>> + return -EFAULT;
>> + }
>> +
>> + /* Submit GetQuote Request */
>> + ret = tdx_get_quote_hypercall(buf);
>> + if (ret) {
>> + pr_err("GetQuote hypercall failed, status:%lx\n", ret);
>> + free_quote_entry(entry);
>> + return -EIO;
>> + }
>> +
>> + /* Add current quote entry to quote_list to track active requests */
>> + add_quote_entry(entry);
>
> There is a race condition. Interrupt can arrive after hypercall and before
> list_add_tail and workqueue can run before reaching add_quote_entry(). lock
> around both hypercall and list_add_tail. i.e. lock, hypercall, list_add_tail,
> unlock.
Good catch!. I will hold the hold the lock before hypercall as you have
suggested.
>
>
>> +
>> + /* Wait for attestation completion */
>> + ret = wait_for_completion_interruptible(&entry->compl);
>> + if (ret < 0) {
>> + /*
>> + * For premature termination, since VMM still owns the
>> + * shared buffer, mark the request invalid to let
>> + * quote_callback_handler() handle the memory cleanup
>> + * function.
>> + */
>> + invalidate_quote_request(entry);
>
> Interrupt can arrive after signal interrupt. So invalidate_quote_request()
> should check if the request is already processed, and return 0 or -EINTR.
> Probably check the state always and del_list under single lock/unlock pair.
Agree. But I think we should return -EINTR for the interrupted case
irrespective of the processed status (so no return 0).
I will hold the lock and handle the cleanup for the processed
status.
>
>
>> + return -EINTR;
>> + }
>> +
>> + /*
>> + * If GetQuote request completed successfully, copy the result
>> + * back to the user and do the cleanup.
>> + */
>> + if (copy_to_user((void __user *)req.buf, buf->vmaddr, req.len))
>> + ret = -EIO;
>
> -EFAULT.
>
>
>> +
>> + /*
>> + * Reaching here means GetQuote request is processed
>> + * successfully. So do the cleanup.
>> + */
>> + del_quote_entry(entry);
>> + free_quote_entry(entry);
>> +
>> + return 0;
>> +}
>> +
>> +static void attestation_callback_handler(void)
>> +{
>> + queue_work(quote_wq, "e_work);
>> +}
>> +
>> +static void quote_callback_handler(struct work_struct *work)
>> +{
>> + struct tdx_quote_hdr *quote_hdr;
>> + struct quote_entry *entry, *next;
>> +
>> + /* Find processed quote request and mark it complete */
>> + mutex_lock("e_lock);
>> + list_for_each_entry_safe(entry, next, "e_list, list) {
>> + quote_hdr = (struct tdx_quote_hdr *)entry->buf->vmaddr;
>> + if (quote_hdr->status == GET_QUOTE_IN_FLIGHT)
>> + continue;
>> + /*
>> + * If user invalidated the current request, remove the
>> + * entry from the quote list and free it. If the request
>> + * is still valid, mark it complete.
>> + */
>> + if (entry->valid) {
>> + complete(&entry->compl);
>> + } else {
>> + list_del(&entry->list);
>> + free_quote_entry(entry);
>> + }
>> + }
>> + mutex_unlock("e_lock);
>> +}
>> +
>> static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
>> unsigned long arg)
>> {
>> @@ -81,6 +364,9 @@ static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
>> case TDX_CMD_GET_REPORT:
>> ret = tdx_get_report(argp);
>> break;
>> + case TDX_CMD_GET_QUOTE:
>> + ret = tdx_get_quote(argp);
>> + break;
>> default:
>> pr_debug("cmd %d not supported\n", cmd);
>> break;
>> @@ -103,6 +389,13 @@ static int __init tdx_attestation_init(void)
>> if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>> return -EIO;
>>
>> + quote_wq = create_singlethread_workqueue("tdx_quote_handler");
>> +
>> + INIT_WORK("e_work, quote_callback_handler);
>> +
>> + /* Register attestation event notify handler */
>> + tdx_setup_ev_notify_handler(attestation_callback_handler);
>> +
>> miscdev.name = DRIVER_NAME;
>> miscdev.minor = MISC_DYNAMIC_MINOR;
>> miscdev.fops = &tdx_attest_fops;
>> diff --git a/arch/x86/include/uapi/asm/tdx.h b/arch/x86/include/uapi/asm/tdx.h
>> index e06da56058a1..76a9e71e7b39 100644
>> --- a/arch/x86/include/uapi/asm/tdx.h
>> +++ b/arch/x86/include/uapi/asm/tdx.h
>> @@ -39,4 +39,49 @@ struct tdx_report_req {
>> */
>> #define TDX_CMD_GET_REPORT _IOWR('T', 0x01, struct tdx_report_req)
>>
>> +/* struct tdx_quote_req: Request to generate TD Quote using TDREPORT
>> + *
>> + * @buf : Pass user data that includes TDREPORT as input. Upon
>> + * successful completion of IOCTL, output is copied
>> + * back to the same buffer.
>> + * @len : Length of the buffer.
>> + */
>> +struct tdx_quote_req {
>> + __u64 buf;
>> + __u64 len;
>> +};
>> +
>> +/*
>> + * TDX_CMD_GET_QUOTE - Get TD Quote from QE/QGS using GetQuote
>> + * TDVMCALL.
>> + *
>> + * Returns 0 on success, -EINTR for interrupted request, and
>> + * standard errono on other failures.
>> + */
>> +#define TDX_CMD_GET_QUOTE _IOR('T', 0x02, struct tdx_quote_req)
>> +
>> +/* TD Quote status codes */
>> +#define GET_QUOTE_SUCCESS 0
>> +#define GET_QUOTE_IN_FLIGHT 0xffffffffffffffff
>> +#define GET_QUOTE_ERROR 0x8000000000000000
>> +#define GET_QUOTE_SERVICE_UNAVAILABLE 0x8000000000000001
>> +
>> +/*
>> + * Format of Quote data header. More details can be found in TDX
>> + * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
>> + * section titled "TDG.VP.VMCALL<GetQuote>"
>> + */
>> +struct tdx_quote_hdr {
>> + /* Quote version, filled by TD */
>> + __u64 version;
>> + /* Status code of Quote request, filled by VMM */
>> + __u64 status;
>> + /* Length of TDREPORT, filled by TD */
>> + __u32 in_len;
>> + /* Length of Quote, filled by VMM */
>> + __u32 out_len;
>> + /* Actual Quote data */
>> + __u64 data[0];
>> +};
>> +
>> #endif /* _UAPI_ASM_X86_TDX_H */
>> --
>> 2.25.1
>>
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Powered by blists - more mailing lists