linux-kernel - Re: [PATCH v2 2/2] x86/sgx: Implement EUPDATESVN and opportunistically call it during first EPC page alloc

Message-ID: <a0a803275d317f88afdd757afa30e84a26b05656.camel@intel.com>
Date: Tue, 8 Apr 2025 00:06:32 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "Reshetova, Elena" <elena.reshetova@...el.com>, "jarkko@...nel.org"
	<jarkko@...nel.org>
CC: "Hansen, Dave" <dave.hansen@...el.com>, "linux-sgx@...r.kernel.org"
	<linux-sgx@...r.kernel.org>, "Scarlata, Vincent R"
	<vincent.r.scarlata@...el.com>, "x86@...nel.org" <x86@...nel.org>,
	"Annapurve, Vishal" <vannapurve@...gle.com>, "bondarn@...gle.com"
	<bondarn@...gle.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "Mallick, Asit K" <asit.k.mallick@...el.com>,
	"Aktas, Erdem" <erdemaktas@...gle.com>, "Cai, Chong" <chongc@...gle.com>,
	"Raynor, Scott" <scott.raynor@...el.com>, "dionnaglaze@...gle.com"
	<dionnaglaze@...gle.com>
Subject: Re: [PATCH v2 2/2] x86/sgx: Implement EUPDATESVN and
 opportunistically call it during first EPC page alloc

On Mon, 2025-04-07 at 08:23 +0000, Reshetova, Elena wrote:
> > On Fri, Apr 04, 2025 at 06:53:17AM +0000, Reshetova, Elena wrote:
> > > > On Wed, Apr 02, 2025 at 01:11:25PM +0000, Reshetova, Elena wrote:
> > > > > > > current SGX kernel code does not handle such errors in any other way
> > > > > > > than notifying that operation failed for other ENCLS leaves. So, I don't
> > > > > > > see why ENCLS[EUPDATESVN] should be different from existing behaviour?
> > > > > > 
> > > > > > While not disagreeing fully (it depends on call site), in some
> > > > > > situations it is more difficult to take more preventive actions.
> > > > > > 
> > > > > > This is a situation where we know that there are *zero* EPC pages in
> > > > > > traffic so it is relatively easy to stop the madness, isn't it?
> > > > > > 
> > > > > > I guess the best action would be to make sgx_alloc_epc_page()
> > > > > > consistently return -ENOMEM, if the unexpected happens.
> > > > > 
> > > > > But this would be very misleading imo. We do have memory, even page
> > > > > allocation might function as normal in EPC, the only thing that is broken
> > > > > can be EUPDATESVN functionality. Returning -ENOMEM in this case seems
> > > > > wrong.
> > > > 
> > > > This makes it not misleading at all:
> > > > 
> > > > 	pr_err("EUPDATESVN: unknown error %d\n", ret);
> > > > 
> > > > Since hardware should never return this, it indicates a kernel bug.
> > > 
> > > OK, so you propose in this case to print the above message, sgx_updatesvn
> > > returning an error, and then NULL from __sgx_alloc_epc_page_from_node and
> > > the __sgx_alloc_epc_page returning -ENOMEM after an iteration over
> > > a whole set of numa nodes given that we will keep getting the unknown error
> > > on each node upon trying to do an allocation from each one?
> > 
> > I'd disable ioctl's in this case and return -ENOMEM. It's a cheap sanity
> > check. Should not ever happen, but if e.g., a new kernel patch breaks
> > anything, it could help catching issues.
> > 
> > We are talking here about situation that is never expected to happen so I
> > don't think it is too heavy hammer here. Here it makes sense because not
> > much effort is required to implement the counter-measures.
> 
> OK, but does it really make sense to explicitly disable ioctls?
> Note that everything *in practice* will be disabled simply because not a single
> page can be allocated from EPC anymore, since we are getting -ENOMEM on EPC
> page allocation. Also, note that any approach we choose should be symmetrical
> to the SGX virtualization side also, which doesn't use ioctls at all. Simply returning
> -ENOMEM for page allocation in EPC seems like a correct symmetrical solution
> that would work for both native enclaves and EPC pages allocated for VMs.
> And nothing would be able to proceed creating/managing enclaves at this point.
>

Right, failing ioctls() doesn't cover SGX virtualization.  If we ever want to
fail, we should fail the EPC allocation.
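
To illustrate why failing at the allocation level covers both users, here is a
minimal userspace sketch. It is not kernel code: struct epc_pool and all the
model_* names are made up for this example and do not appear in the patch. The
only point is that a failure latched in the one shared allocation path is seen
by the enclave path and the virtual EPC path alike.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: a pool of free "EPC pages". */
struct epc_pool {
	int free_pages;		/* pages still available */
	bool svn_failed;	/* latched once the SVN update hits an unexpected error */
};

/*
 * Stand-in for the SVN update. 'hw_ret' is the simulated hardware result;
 * any non-zero value is treated as the "should never happen" case here.
 */
static int model_update_svn(struct epc_pool *pool, int hw_ret)
{
	if (hw_ret != 0) {
		pool->svn_failed = true;
		return -EIO;
	}
	return 0;
}

/* The single allocation path shared by every EPC user in this model. */
static int model_alloc_epc_page(struct epc_pool *pool)
{
	if (pool->svn_failed)
		return -ENOMEM;	/* refuse all allocations after the failure */
	if (pool->free_pages == 0)
		return -ENOMEM;	/* genuinely out of pages */
	pool->free_pages--;
	return 0;
}

/* Native enclave user (reached via ioctls in the real driver). */
static int model_enclave_add_page(struct epc_pool *pool)
{
	return model_alloc_epc_page(pool);
}

/* Virtualization user (no ioctls involved), same allocator underneath. */
static int model_vepc_fault(struct epc_pool *pool)
{
	return model_alloc_epc_page(pool);
}
```

With a pool of a few free pages, model_enclave_add_page() succeeds until
model_update_svn() is fed a non-zero result; after that both
model_enclave_add_page() and model_vepc_fault() return -ENOMEM without any
ioctl-specific check.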

Btw, for the unknown error, and any other errors which should not happen,
couldn't we use ENCLS_WARN()?  AFAICT there are already cases where we use
ENCLS_WARN() for those "impossible-to-happen" errors.

E.g., in __sgx_encl_extend():

		ret = __eextend(sgx_get_epc_virt_addr(encl->secs.epc_page),
				sgx_get_epc_virt_addr(epc_page) + offset);
		if (ret) {
			if (encls_failed(ret))
				ENCLS_WARN(ret, "EEXTEND");

			return -EIO;
		}
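
A rough userspace sketch of how the same pattern might look for the unknown
error case: expected codes map to ordinary errnos, anything else triggers a
one-time warning and -EIO. Everything here is an assumption for illustration:
the model_* names and the MODEL_* codes are invented, the values are not the
real SGX return codes, and model_warn_once() only loosely imitates a warn-once
helper rather than reproducing ENCLS_WARN().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes for this model only; not the real SGX values. */
enum model_ret {
	MODEL_OK = 0,
	MODEL_RETRYABLE = 1,	/* an expected, recoverable condition */
};

static int model_warn_count;	/* how many warnings were actually printed */

/* Print the warning at most once, no matter how often it is hit. */
static void model_warn_once(int ret, const char *name)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	model_warn_count++;
	fprintf(stderr, "%s returned %d (0x%x)\n", name, ret, (unsigned int)ret);
}

/* Map a simulated leaf result to an errno, warning on unexpected codes. */
static int model_handle_ret(int ret, const char *name)
{
	switch (ret) {
	case MODEL_OK:
		return 0;
	case MODEL_RETRYABLE:
		return -EAGAIN;
	default:
		/* "impossible-to-happen" error: complain once, then fail */
		model_warn_once(ret, name);
		return -EIO;
	}
}
```

Calling model_handle_ret() repeatedly with unexpected codes keeps returning
-EIO but emits only one warning, which is the behaviour one would want if the
same unknown error were hit on every NUMA node during an allocation attempt.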