linux-kernel - Re: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <D1A9PC6LWL2S.38KB2X3EL9X79@kernel.org>
Date: Wed, 15 May 2024 16:54:43 +0300
From: "Jarkko Sakkinen" <jarkko@...nel.org>
To: "Dmitrii Kuvaiskii" <dmitrii.kuvaiskii@...el.com>,
 <dave.hansen@...ux.intel.com>, <kai.huang@...el.com>,
 <haitao.huang@...ux.intel.com>, <reinette.chatre@...el.com>,
 <linux-sgx@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Cc: <mona.vij@...el.com>, <kailun.qin@...el.com>, <stable@...r.kernel.org>,
 Marcelina Kościelnicka <mwk@...isiblethingslab.com>
Subject: Re: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread
 returns SIGBUS

On Wed May 15, 2024 at 4:12 PM EEST, Dmitrii Kuvaiskii wrote:
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..41f14b1a3025 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -382,8 +382,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	 * If ret == -EBUSY then page was created in another flow while
>  	 * running without encl->lock
>  	 */
> -	if (ret)
> +	if (ret) {
> +		if (ret == -EBUSY)
> +			vmret = VM_FAULT_NOPAGE;
>  		goto err_out_shrink;
> +	}

I agree that there is a bug but it does not categorize as race
condition.

The bug is simply that for a valid page SIGBUS might be returned.
The fix is correct but the claim is not.

>  
>  	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
>  	pginfo.addr = encl_page->desc & PAGE_MASK;
> @@ -419,7 +422,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  err_out_shrink:
>  	sgx_encl_shrink(encl, va_page);
>  err_out_epc:
> -	sgx_encl_free_epc_page(epc_page);
> +	sgx_free_epc_page(epc_page);
>  err_out_unlock:
>  	mutex_unlock(&encl->lock);
>  	kfree(encl_page);

Agree with code change 100% but not with the description.

I'd cut out 90% of the description out and just make the argument of
the wrong error code, and done. The sequence is great for showing
how this could happen. The prose makes my head hurt tbh.

BR, Jarkko