linux-kernel - Re: [RFC PATCH 2/4] x86/sgx: Add basic infrastructure to recover from errors in SGX memory


Message-ID: <YO3J3fGh8Zx39QH0@google.com>
Date:   Tue, 13 Jul 2021 17:14:05 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Tony Luck <tony.luck@...el.com>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Dave Hansen <dave.hansen@...el.com>,
        Jarkko Sakkinen <jarkko.sakkinen@...el.com>
Subject: Re: [RFC PATCH 2/4] x86/sgx: Add basic infrastructure to recover
 from errors in SGX memory

On Tue, Jun 08, 2021, Tony Luck wrote:
> +	/*
> +	 * Poison was synchronously consumed by an enclave in the current
> +	 * task. Send a SIGBUS here. Hardware should prevent this enclave
> +	 * from being re-entered, so no concern that the poisoned page
> +	 * will be touched a second time. The poisoned EPC page will be
> +	 * dropped as pages are freed during task exit.
> +	 */
> +	if (flags & MF_ACTION_REQUIRED) {
> +		if (epc_page->type == SGX_PAGE_TYPE_REG) {

Why only for REG pages?  I don't see the value added by assuming that SECS, TCS,
and VA pages will result in uncorrectable #MC.

> +			encl_page = epc_page->owner;
> +			addr = encl_page->desc & PAGE_MASK;
> +			force_sig_mceerr(BUS_MCEERR_AR, (void *)addr, PAGE_SHIFT);
> +		} else {
> +			force_sig(SIGBUS);
> +		}
> +		goto unlock;
> +	}
> +
> +	section = &sgx_epc_sections[epc_page->section];
> +	node = section->node;
> +
> +	if (page_flags & SGX_EPC_PAGE_POISON)
> +		goto unlock;
> +
> +	if (page_flags & SGX_EPC_PAGE_DIRTY) {
> +		list_del(&epc_page->list);

As suggested in the prior patch, why not let the sanitizer handle poisoned pages?

> +	} else if (page_flags & SGX_EPC_PAGE_FREE) {
> +		spin_lock(&node->lock);
> +		epc_page->owner = NULL;
> +		list_del(&epc_page->list);
> +		sgx_nr_free_pages--;
> +		spin_unlock(&node->lock);
> +		goto unlock;
> +	}
> +
> +	switch (epc_page->type) {
> +	case SGX_PAGE_TYPE_REG:
> +		encl_page = epc_page->owner;
> +		/* Unmap the page, unless it was already in progress to be freed */
> +		if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) {
> +			spin_unlock(&sgx_reclaimer_lock);
> +			sgx_reclaimer_block(epc_page);
> +			kref_put(&encl_page->encl->refcount, sgx_encl_release);
> +			goto done;
> +		}
> +		break;
> +
> +	case SGX_PAGE_TYPE_KVM:
> +		spin_unlock(&sgx_reclaimer_lock);
> +		sgx_memory_failure_vepc(epc_page);

...

> @@ -217,6 +218,13 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
>  	return 0;
>  }
>  
> +void sgx_memory_failure_vepc(struct sgx_epc_page *epc_page)
> +{
> +	struct sgx_vepc *vepc = epc_page->owner;
> +
> +	send_sig(SIGBUS, vepc->task, false);

...

> +}
> +
>  static int sgx_vepc_open(struct inode *inode, struct file *file)
>  {
>  	struct sgx_vepc *vepc;
> @@ -226,6 +234,7 @@ static int sgx_vepc_open(struct inode *inode, struct file *file)
>  		return -ENOMEM;
>  	mutex_init(&vepc->lock);
>  	xa_init(&vepc->page_array);
> +	vepc->task = current;

This is broken: there is no guarantee whatsoever that the task that opened the
vEPC is the task that is actively using it.  Since Dave successfully lobbied
to allow multiple mm_structs per vEPC, it might not even be the correct process.
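
For illustration only, here is a userspace sketch of the same hazard, not SGX
code.  It assumes an ordinary pipe as a stand-in for the vEPC file and uses a
hypothetical helper name, opener_differs_from_user().  The parent records its
PID at creation time (analogous to "vepc->task = current"), then a forked
child, which has its own mm_struct, uses the inherited descriptor.  The
recorded opener and the actual user end up being different processes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical illustration: returns 1 if the process that used the
 * descriptor differs from the process recorded when it was created,
 * 0 if they are the same, and -1 on error.
 */
int opener_differs_from_user(void)
{
	int fds[2];
	pid_t opener, child, user = 0;
	ssize_t n;

	if (pipe(fds) < 0)
		return -1;

	/* Analogous to "vepc->task = current" at open() time. */
	opener = getpid();

	child = fork();
	if (child < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (child == 0) {
		/* Child: a different task uses the inherited descriptor. */
		pid_t me = getpid();

		n = write(fds[1], &me, sizeof(me));
		_exit(n == (ssize_t)sizeof(me) ? 0 : 1);
	}

	/* Parent: learn which task actually touched the descriptor. */
	close(fds[1]);
	n = read(fds[0], &user, sizeof(user));
	close(fds[0]);
	waitpid(child, NULL, 0);

	if (n != (ssize_t)sizeof(user))
		return -1;

	return user != opener;
}
```

A signal sent to the recorded opener in this sketch would go to the parent,
even though the child is the one that touched the resource.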

>  	file->private_data = vepc;
>  
> -- 
> 2.29.2
>