linux-kernel - Re: [PATCH v6 5/5] x86/tdx: Add Quote generation support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c07c9fa1-ff6a-df8a-6050-60ade29367f3@linux.intel.com>
Date:   Fri, 13 May 2022 12:29:55 -0700
From:   Sathyanarayanan Kuppuswamy 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Isaku Yamahata <isaku.yamahata@...il.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H . Peter Anvin" <hpa@...or.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Tony Luck <tony.luck@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kai Huang <kai.huang@...el.com>,
        Wander Lairson Costa <wander@...hat.com>,
        marcelo.cerri@...onical.com, tim.gardner@...onical.com,
        khalid.elmously@...onical.com, philip.cox@...onical.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 5/5] x86/tdx: Add Quote generation support



On 5/13/22 11:58 AM, Isaku Yamahata wrote:
> On Thu, May 12, 2022 at 03:19:52PM -0700,
> Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
> 
>> In TDX guest, the second stage in attestation process is to send the
>> TDREPORT to QE/QGS to generate the TD Quote. For platforms that does
>> not support communication channels like vsock or TCP/IP, implement
>> support to get TD Quote using hypercall. GetQuote hypercall can be used
>> by the TD guest to request VMM facilitate the Quote generation via
>> QE/QGS. More details about GetQuote hypercall can be found in TDX
>> Guest-Host Communication Interface (GHCI) for Intel TDX 1.0, section
>> titled "TDG.VP.VMCALL<GetQuote>.
>>
>> Since GetQuote is an asynchronous request hypercall, it will not block
>> till the TD Quote is generated. So VMM uses callback interrupt vector
>> configured by SetupEventNotifyInterrupt hypercall to notify the guest
>> about Quote generation completion or failure.
>>
>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
>> with TDREPORT data as input, which is further used by the VMM to copy
>> the TD Quote result after successful Quote generation. To create the
>> shared buffer without breaking the direct map, allocate physically
>> contiguous kernel memory and create a virtual mapping for it using
>> vmap(). set_memory_*crypted_noalias() functions can be used to share or
>> unshare the vmapped page without affecting the direct map.
>>
>> Also note that, shared buffer allocation is currently handled in IOCTL
>> handler, although it will increase the TDX_CMD_GET_QUOTE IOCTL response
>> time, it is negligible compared to the time required for the quote
>> generation completion. So IOCTL performance optimization is not
>> considered at this time.
>>
>> For shared buffer allocation, alternatives like using the DMA API is
>> also considered. Although it simpler to use, it is not preferred because
>> dma_alloc_*() APIs require a valid bus device as argument, which would
>> need converting the attestation driver into a platform device driver.
>> This is unnecessary, and since the attestation driver does not do real
>> DMA, there is no need to use real DMA APIs.
>>
>> Add support for TDX_CMD_GET_QUOTE IOCTL to allow attestation agent
>> submit GetQuote requests from the user space. Since Quote generation
>> is an asynchronous request, IOCTL will block indefinitely for the VMM
>> response in wait_for_completion_interruptible() call. Using this call
>> will also add an option for the user to end the current request
>> prematurely by raising any signals. This can be used by attestation
>> agent to implement Quote generation timeout feature. If attestation
>> agent is aware of time it can validly wait for QE/QGS response, then
>> a possible timeout support can be implemented in the user application
>> using signals. Quote generation timeout feature is currently not
>> implemented in the driver because the current TDX specification does
>> not have any recommendation for it.
>>
>> After submitting the GetQuote request using hypercall, the shared buffer
>> allocated for the current request is owned by the VMM. So, during this
>> wait window, if the user terminates the request by raising a signal or
>> by terminating the application, add a logic to do the memory cleanup
>> after receiving the VMM response at a later time. Such memory cleanup
>> support requires accepting the page again using TDX_ACCEPT_PAGE TDX
>> Module call. So to not overload the callback IRQ handler, move the
>> callback handler logic to a separate work queue.
>>
>> To support parallel GetQuote requests, use linked list to track the
>> active GetQuote requests and upon receiving the callback IRQ, loop
>> through the active requests and mark the processed requests complete.
>> Users can open multiple instances of the attestation device and send
>> GetQuote requests in parallel.
>>
>> Reviewed-by: Tony Luck <tony.luck@...el.com>
>> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
>> Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>> ---
>>   arch/x86/coco/tdx/attest.c      | 293 ++++++++++++++++++++++++++++++++
>>   arch/x86/include/uapi/asm/tdx.h |  45 +++++
>>   2 files changed, 338 insertions(+)
>>
>> diff --git a/arch/x86/coco/tdx/attest.c b/arch/x86/coco/tdx/attest.c
>> index a5f4111f9b18..5531a1834f8c 100644
>> --- a/arch/x86/coco/tdx/attest.c
>> +++ b/arch/x86/coco/tdx/attest.c
>> @@ -13,16 +13,51 @@
>>   #include <linux/miscdevice.h>
>>   #include <linux/mm.h>
>>   #include <linux/io.h>
>> +#include <linux/set_memory.h>
>> +#include <linux/mutex.h>
>>   #include <asm/tdx.h>
>> +#include <asm/coco.h>
>>   #include <uapi/asm/tdx.h>
>>   
>>   #define DRIVER_NAME "tdx-attest"
>>   
>>   /* TDREPORT module call leaf ID */
>>   #define TDX_GET_REPORT			4
>> +/* GetQuote hypercall leaf ID */
>> +#define TDVMCALL_GET_QUOTE             0x10002
>> +
>> +/* Used for buffer allocation in GetQuote request */
>> +struct quote_buf {
>> +	void *vmaddr;
>> +	int count;
>> +};
>> +
>> +/* List entry of quote_list*/
>> +struct quote_entry {
>> +	bool valid;
>> +	struct quote_buf *buf;
>> +	struct list_head list;
>> +	struct completion compl;
>> +};
>>   
>>   static struct miscdevice miscdev;
>>   
>> +/*
>> + * To support parallel GetQuote requests, use the list
>> + * to track active GetQuote requests.
>> + */
>> +static LIST_HEAD(quote_list);
>> +
>> +/* Lock to protect quote_list */
>> +static DEFINE_MUTEX(quote_lock);
>> +
>> +/*
>> + * Workqueue to handle Quote data after Quote generation
>> + * notification from VMM.
>> + */
>> +struct workqueue_struct *quote_wq;
>> +struct work_struct quote_work;
>> +
>>   static long tdx_get_report(void __user *argp)
>>   {
>>   	void *reportdata = NULL, *tdreport = NULL;
>> @@ -71,6 +106,254 @@ static long tdx_get_report(void __user *argp)
>>   	return ret;
>>   }
>>   
>> +/* tdx_get_quote_hypercall() - Request to get TD Quote using TDREPORT */
>> +static long tdx_get_quote_hypercall(struct quote_buf *buf)
>> +{
>> +	struct tdx_hypercall_args args = {0};
>> +
>> +	args.r10 = TDX_HYPERCALL_STANDARD;
>> +	args.r11 = TDVMCALL_GET_QUOTE;
>> +	args.r12 = cc_mkdec(page_to_phys(vmalloc_to_page(buf->vmaddr)));
>> +	args.r13 = buf->count * PAGE_SIZE;
>> +
>> +	/*
>> +	 * Pass the physical address of TDREPORT to the VMM and
>> +	 * trigger the Quote generation. It is not a blocking
>> +	 * call, hence completion of this request will be notified to
>> +	 * the TD guest via a callback interrupt. More info about ABI
>> +	 * can be found in TDX Guest-Host-Communication Interface
>> +	 * (GHCI), sec titled "TDG.VP.VMCALL<GetQuote>".
>> +	 */
>> +	return __tdx_hypercall(&args, 0);
>> +}
>> +
>> +/*
>> + * alloc_quote_buf() - Used to allocate a shared buffer of
>> + *		       given size.
>> + *
>> + * Size is page aligned and the allocated memory is decrypted
>> + * to allow VMM access to it. Uses VMAP to create a shared mapping
>> + * for the buffer to not affect the direct map.
>> + */
>> +static struct quote_buf *alloc_quote_buf(u64 req_size)
>> +{
>> +	int size = PAGE_ALIGN(req_size);
>> +	void *addr = NULL, *vmaddr = NULL;
>> +	int count = size >> PAGE_SHIFT;
>> +	struct page **pages = NULL;
>> +	struct quote_buf *buf;
>> +	int i;
>> +
>> +	buf = kmalloc(sizeof(*buf), GFP_KERNEL);
>> +	if (!buf)
>> +		return NULL;
>> +
>> +	addr = alloc_pages_exact(size, GFP_KERNEL);
>> +	if (!addr)
>> +		goto alloc_failed;
>> +
>> +	/* Allocate mem for array of page ptrs */
>> +	pages = kcalloc(count, sizeof(*pages), GFP_KERNEL);
>> +	if (!pages)
>> +		goto alloc_failed;
>> +
>> +	for (i = 0; i < count; i++)
>> +		pages[i] = virt_to_page(addr + i * PAGE_SIZE);
>> +
>> +	/*
>> +	 * Use VMAP to create shared mapping without affecting
>> +	 * the direct map. Use VM_MAP_PUT_PAGES to allow vmap()
>> +	 * responsible for freeing the pages when using vfree().
>> +	 */
>> +	vmaddr = vmap(pages, count, VM_MAP_PUT_PAGES, PAGE_KERNEL);
>> +	if (!vmaddr)
>> +		goto alloc_failed;
>> +
>> +	/* Use noalias variant to not affect the direct mapping */
>> +	if (set_memory_decrypted_noalias((unsigned long)vmaddr, count))
>> +		goto alloc_failed;
>> +
>> +	buf->vmaddr = vmaddr;
>> +	buf->count = count;
>> +
>> +	return buf;
>> +
>> +alloc_failed:
>> +	if (!vmaddr) {
>> +		kfree(pages);
>> +		if (addr)
>> +			free_pages_exact(addr, size);
>> +	}
>> +	vfree(vmaddr);
>> +	kfree(buf);
>> +	return NULL;
>> +}
>> +
>> +/* Remove the shared mapping and free the buffer */
>> +static void free_quote_buf(struct quote_buf *buf)
>> +{
>> +	if (!buf)
>> +		return;
>> +
>> +	/* Mark pages private */
>> +	if (set_memory_encrypted_noalias((unsigned long)buf->vmaddr,
>> +				buf->count)) {
>> +		pr_warn("Failed to encrypt %d pages at %p", buf->count,
>> +				buf->vmaddr);
>> +		return;
>> +	}
>> +
>> +	vfree(buf->vmaddr);
>> +	kfree(buf);
>> +}
>> +
>> +static struct quote_entry *alloc_quote_entry(u64 buf_len)
>> +{
>> +	struct quote_entry *entry = NULL;
>> +
>> +	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
>> +	if (!entry)
>> +		return NULL;
>> +
>> +	/* Allocate buffer for quote request */
>> +	entry->buf = alloc_quote_buf(buf_len);
>> +	if (!entry->buf) {
>> +		kfree(entry);
>> +		return NULL;
>> +	}
>> +
>> +	init_completion(&entry->compl);
>> +	entry->valid = true;
>> +
>> +	return entry;
>> +}
>> +
>> +static void free_quote_entry(struct quote_entry *entry)
>> +{
>> +	free_quote_buf(entry->buf);
>> +	kfree(entry);
>> +}
>> +
>> +static void add_quote_entry(struct quote_entry *entry)
>> +{
>> +	mutex_lock(&quote_lock);
>> +	list_add_tail(&entry->list, &quote_list);
>> +	mutex_unlock(&quote_lock);
>> +}
>> +
>> +static void del_quote_entry(struct quote_entry *entry)
>> +{
>> +	mutex_lock(&quote_lock);
>> +	list_del(&entry->list);
>> +	mutex_unlock(&quote_lock);
>> +}
>> +
>> +static void invalidate_quote_request(struct quote_entry *entry)
>> +{
>> +	mutex_lock(&quote_lock);
>> +	entry->valid = false;
>> +	mutex_unlock(&quote_lock);
>> +}
>> +
>> +static long tdx_get_quote(void __user *argp)
>> +{
>> +	struct quote_entry *entry;
>> +	struct tdx_quote_req req;
>> +	struct quote_buf *buf;
>> +	long ret;
>> +
>> +	/* Copy GetQuote request struct from user buffer */
>> +	if (copy_from_user(&req, argp, sizeof(struct tdx_quote_req)))
>> +		return -EFAULT;
>> +
>> +	/* Make sure the length is valid */
>> +	if (!req.len)
>> +		return -EINVAL;
>> +
>> +	entry = alloc_quote_entry(req.len);
>> +	if (!entry)
>> +		return -ENOMEM;
>> +
>> +	buf = entry->buf;
>> +
>> +	/* Copy TDREPORT from user buffer to kernel Quote buffer */
>> +	if (copy_from_user(buf->vmaddr, (void __user *)req.buf, req.len)) {
>> +		free_quote_entry(entry);
>> +		return -EFAULT;
>> +	}
>> +
>> +	/* Submit GetQuote Request */
>> +	ret = tdx_get_quote_hypercall(buf);
>> +	if (ret) {
>> +		pr_err("GetQuote hypercall failed, status:%lx\n", ret);
>> +		free_quote_entry(entry);
>> +		return -EIO;
>> +	}
>> +
>> +	/* Add current quote entry to quote_list to track active requests */
>> +	add_quote_entry(entry);
> 
> There is a race condition. Interrupt can arrive after hypercall and before
> list_add_tail and workqueue can run before reaching add_quote_entry().  lock
> around both hypercall and list_add_tail. i.e. lock, hypercall, list_add_tail,
> unlock.

Good catch!. I will hold the hold the lock before hypercall as you have
suggested.

> 
> 
>> +
>> +	/* Wait for attestation completion */
>> +	ret = wait_for_completion_interruptible(&entry->compl);
>> +	if (ret < 0) {
>> +		/*
>> +		 * For premature termination, since VMM still owns the
>> +		 * shared buffer, mark the request invalid to let
>> +		 * quote_callback_handler() handle the memory cleanup
>> +		 * function.
>> +		 */
>> +		invalidate_quote_request(entry);
> 
> Interrupt can arrive after signal interrupt.  So invalidate_quote_request()
> should check if the request is already processed, and return 0 or -EINTR.
> Probably check the state always and del_list under single lock/unlock pair.

Agree. But I think we should return -EINTR for the interrupted case
irrespective of the processed status (so no return 0).

I will hold the lock and handle the cleanup for the processed
status.

> 
> 
>> +		return -EINTR;
>> +	}
>> +
>> +	/*
>> +	 * If GetQuote request completed successfully, copy the result
>> +	 * back to the user and do the cleanup.
>> +	 */
>> +	if (copy_to_user((void __user *)req.buf, buf->vmaddr, req.len))
>> +		ret = -EIO;
> 
> -EFAULT.
> 
> 
>> +
>> +	/*
>> +	 * Reaching here means GetQuote request is processed
>> +	 * successfully. So do the cleanup.
>> +	 */
>> +	del_quote_entry(entry);
>> +	free_quote_entry(entry);
>> +
>> +	return 0;
>> +}
>> +
>> +static void attestation_callback_handler(void)
>> +{
>> +	queue_work(quote_wq, &quote_work);
>> +}
>> +
>> +static void quote_callback_handler(struct work_struct *work)
>> +{
>> +	struct tdx_quote_hdr *quote_hdr;
>> +	struct quote_entry *entry, *next;
>> +
>> +	/* Find processed quote request and mark it complete */
>> +	mutex_lock(&quote_lock);
>> +	list_for_each_entry_safe(entry, next, &quote_list, list) {
>> +		quote_hdr = (struct tdx_quote_hdr *)entry->buf->vmaddr;
>> +		if (quote_hdr->status == GET_QUOTE_IN_FLIGHT)
>> +			continue;
>> +		/*
>> +		 * If user invalidated the current request, remove the
>> +		 * entry from the quote list and free it. If the request
>> +		 * is still valid, mark it complete.
>> +		 */
>> +		if (entry->valid) {
>> +			complete(&entry->compl);
>> +		} else {
>> +			list_del(&entry->list);
>> +			free_quote_entry(entry);
>> +		}
>> +	}
>> +	mutex_unlock(&quote_lock);
>> +}
>> +
>>   static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
>>   			     unsigned long arg)
>>   {
>> @@ -81,6 +364,9 @@ static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
>>   	case TDX_CMD_GET_REPORT:
>>   		ret = tdx_get_report(argp);
>>   		break;
>> +	case TDX_CMD_GET_QUOTE:
>> +		ret = tdx_get_quote(argp);
>> +		break;
>>   	default:
>>   		pr_debug("cmd %d not supported\n", cmd);
>>   		break;
>> @@ -103,6 +389,13 @@ static int __init tdx_attestation_init(void)
>>   	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>>   		return -EIO;
>>   
>> +	quote_wq = create_singlethread_workqueue("tdx_quote_handler");
>> +
>> +	INIT_WORK(&quote_work, quote_callback_handler);
>> +
>> +	/* Register attestation event notify handler */
>> +	tdx_setup_ev_notify_handler(attestation_callback_handler);
>> +
>>   	miscdev.name = DRIVER_NAME;
>>   	miscdev.minor = MISC_DYNAMIC_MINOR;
>>   	miscdev.fops = &tdx_attest_fops;
>> diff --git a/arch/x86/include/uapi/asm/tdx.h b/arch/x86/include/uapi/asm/tdx.h
>> index e06da56058a1..76a9e71e7b39 100644
>> --- a/arch/x86/include/uapi/asm/tdx.h
>> +++ b/arch/x86/include/uapi/asm/tdx.h
>> @@ -39,4 +39,49 @@ struct tdx_report_req {
>>    */
>>   #define TDX_CMD_GET_REPORT		_IOWR('T', 0x01, struct tdx_report_req)
>>   
>> +/* struct tdx_quote_req: Request to generate TD Quote using TDREPORT
>> + *
>> + * @buf		: Pass user data that includes TDREPORT as input. Upon
>> + *		  successful completion of IOCTL, output is copied
>> + *		  back to the same buffer.
>> + * @len		: Length of the buffer.
>> + */
>> +struct tdx_quote_req {
>> +	__u64 buf;
>> +	__u64 len;
>> +};
>> +
>> +/*
>> + * TDX_CMD_GET_QUOTE - Get TD Quote from QE/QGS using GetQuote
>> + *		       TDVMCALL.
>> + *
>> + * Returns 0 on success, -EINTR for interrupted request, and
>> + * standard errono on other failures.
>> + */
>> +#define TDX_CMD_GET_QUOTE		_IOR('T', 0x02, struct tdx_quote_req)
>> +
>> +/* TD Quote status codes */
>> +#define GET_QUOTE_SUCCESS		0
>> +#define GET_QUOTE_IN_FLIGHT		0xffffffffffffffff
>> +#define GET_QUOTE_ERROR			0x8000000000000000
>> +#define GET_QUOTE_SERVICE_UNAVAILABLE	0x8000000000000001
>> +
>> +/*
>> + * Format of Quote data header. More details can be found in TDX
>> + * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
>> + * section titled "TDG.VP.VMCALL<GetQuote>"
>> + */
>> +struct tdx_quote_hdr {
>> +	/* Quote version, filled by TD */
>> +	__u64 version;
>> +	/* Status code of Quote request, filled by VMM */
>> +	__u64 status;
>> +	/* Length of TDREPORT, filled by TD */
>> +	__u32 in_len;
>> +	/* Length of Quote, filled by VMM */
>> +	__u32 out_len;
>> +	/* Actual Quote data */
>> +	__u64 data[0];
>> +};
>> +
>>   #endif /* _UAPI_ASM_X86_TDX_H */
>> -- 
>> 2.25.1
>>
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer