linux-kernel - Re: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <D1A9QP6G0PW4.2HI60Q8GUT5YE@kernel.org>
Date: Wed, 15 May 2024 16:56:30 +0300
From: "Jarkko Sakkinen" <jarkko@...nel.org>
To: "Jarkko Sakkinen" <jarkko@...nel.org>, "Dmitrii Kuvaiskii"
 <dmitrii.kuvaiskii@...el.com>, <dave.hansen@...ux.intel.com>,
 <kai.huang@...el.com>, <haitao.huang@...ux.intel.com>,
 <reinette.chatre@...el.com>, <linux-sgx@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>
Cc: <mona.vij@...el.com>, <kailun.qin@...el.com>, <stable@...r.kernel.org>,
 Marcelina Kościelnicka <mwk@...isiblethingslab.com>
Subject: Re: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread
 returns SIGBUS

On Wed May 15, 2024 at 4:54 PM EEST, Jarkko Sakkinen wrote:
> On Wed May 15, 2024 at 4:12 PM EEST, Dmitrii Kuvaiskii wrote:
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index 279148e72459..41f14b1a3025 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -382,8 +382,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> >  	 * If ret == -EBUSY then page was created in another flow while
> >  	 * running without encl->lock
> >  	 */
> > -	if (ret)
> > +	if (ret) {
> > +		if (ret == -EBUSY)
> > +			vmret = VM_FAULT_NOPAGE;
> >  		goto err_out_shrink;
> > +	}
>
> I agree that there is a bug but it does not categorize as race
> condition.
>
> The bug is simply that for a valid page SIGBUS might be returned.
> The fix is correct but the claim is not.
>
> >  
> >  	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
> >  	pginfo.addr = encl_page->desc & PAGE_MASK;
> > @@ -419,7 +422,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> >  err_out_shrink:
> >  	sgx_encl_shrink(encl, va_page);
> >  err_out_epc:
> > -	sgx_encl_free_epc_page(epc_page);
> > +	sgx_free_epc_page(epc_page);
> >  err_out_unlock:
> >  	mutex_unlock(&encl->lock);
> >  	kfree(encl_page);
>
> Agree with code change 100% but not with the description.
>
> I'd cut out 90% of the description out and just make the argument of
> the wrong error code, and done. The sequence is great for showing
> how this could happen. The prose makes my head hurt tbh.

Also please remember that stable maintainers need to read all of that
if this is a bug fix (it is a bug fix!) :-) So shorted possible legit
argument, no prose and the sequence was awesome :-)

BR, Jarkko