lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM8PR11MB5750FBF44109B8CAD5574922E78BA@DM8PR11MB5750.namprd11.prod.outlook.com>
Date: Thu, 8 May 2025 14:07:33 +0000
From: "Reshetova, Elena" <elena.reshetova@...el.com>
To: "Hansen, Dave" <dave.hansen@...el.com>
CC: "jarkko@...nel.org" <jarkko@...nel.org>, "seanjc@...gle.com"
	<seanjc@...gle.com>, "Huang, Kai" <kai.huang@...el.com>,
	"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>, "Mallick, Asit K"
	<asit.k.mallick@...el.com>, "Scarlata, Vincent R"
	<vincent.r.scarlata@...el.com>, "Cai, Chong" <chongc@...gle.com>, "Aktas,
 Erdem" <erdemaktas@...gle.com>, "Annapurve, Vishal" <vannapurve@...gle.com>,
	"dionnaglaze@...gle.com" <dionnaglaze@...gle.com>, "bondarn@...gle.com"
	<bondarn@...gle.com>, "Raynor, Scott" <scott.raynor@...el.com>
Subject: RE: [PATCH v4 1/1] x86/sgx: Enable automatic SVN updates for SGX
 enclaves

Thank you very much for your detailed review, Dave!
Responses inline below. 


> On 5/7/25 04:14, Elena Reshetova wrote:
> > In case an SGX vulnerability is discovered and TCB recovery
> > for SGX is triggered, Intel specifies a process that must be
> > followed for a given vulnerability. Steps to mitigate can vary
> > based on vulnerability type, affected components, etc.
> > In some cases, a vulnerability can be mitigated via a runtime
> > recovery flow by shutting down all running SGX enclaves,
> > clearing enclave page cache (EPC), applying a microcode patch
> > that does not require a reboot (via late microcode loading) and
> > restarting all SGX enclaves.
> 
> Plain language and brevity have a lot of value in changelogs. There's a
> substantial amount of irrelevant content here.
> 
> > Even when the above-described runtime recovery flow to mitigate the
> > SGX vulnerability is followed, the SGX attestation evidence will
> > still reflect the security SVN version being equal to the previous
> > state of security SVN (containing vulnerability) that created
> > and managed the enclave until the runtime recovery event. This
> > limitation currently can be only overcome via a platform reboot,
> > which negates all the benefits from the rebootless late microcode
> > loading and not required in this case for functional or security
> > purposes.
> 
> Can this please be trimmed down?

Sure, I can work on trimming the text. 

> 
> Actually, I think I wrote changelogs for this once upon a time. Could
> you please go dig those up and use them?

Could you please suggest where can I find them? Was it for the previous
submission done by Cathy? 

> 
> > SGX architecture introduced a new instruction called ENCLS[EUPDATESVN]
> > to Ice Lake [1].
> 
> Is it really on "Ice Lake" parts? Like, does it show up on
> INTEL_ICELAKE? If not, this is only confusing and mostly irrelevant
> information.

Ok, will remove. 

> 
> > It allows updating security SVN version, given that EPC
> > is completely empty. The latter is required for security reasons
> > in order to reason that enclave security posture is as secure as the
> > security SVN version of the TCB that created it.
> >
> > Additionally it is important to ensure that while ENCLS[EUPDATESVN]
> > runs, no concurrent page creation happens in EPC, because it might
> > result in #GP delivered to the creator. Legacy SW might not be prepared
> > to handle such unexpected #GPs and therefore this patch introduces
> > a locking mechanism in sgx_(vepc_)open to ensure no concurrent EPC
> > allocations can happen.
> >
> > Implement ENCLS[EUPDATESVN] and enable kernel to opportunistically
> > call it during sgx_(vepc_)open().
> 
> 
> > [1]
> > https://cdrdv2.intel.com/v1/dl/getContent/648682?explicitVersion=true
> 
> These become stale almost immediately. Please also cite the document title.

Sure, I can add the title, but I dont have another link. 

> 
> >  arch/x86/include/asm/sgx.h       | 42 ++++++++++++-------
> >  arch/x86/kernel/cpu/sgx/driver.c |  4 ++
> >  arch/x86/kernel/cpu/sgx/encl.c   |  1 +
> >  arch/x86/kernel/cpu/sgx/encls.h  |  5 +++
> >  arch/x86/kernel/cpu/sgx/main.c   | 70
> ++++++++++++++++++++++++++++++++
> >  arch/x86/kernel/cpu/sgx/sgx.h    |  3 ++
> >  arch/x86/kernel/cpu/sgx/virt.c   |  6 +++
> >  7 files changed, 116 insertions(+), 15 deletions(-)
> 
> Gah, how did this get squished back down to a single patch? It was
> multiple patches before. There are multiple logical things going on here
> and they need to be broken out.

Yes, but the code actually did get smaller, so it didn’t feel
like it is worth breaking this into different patches. 
But I will follow your suggestion.

> 
> > diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> > index 6a0069761508..8ac026ef6365 100644
> > --- a/arch/x86/include/asm/sgx.h
> > +++ b/arch/x86/include/asm/sgx.h
> > @@ -27,22 +27,26 @@
> >  /* The bitmask for the EPC section type. */
> >  #define SGX_CPUID_EPC_MASK	GENMASK(3, 0)
> >
> > +/* EUPDATESVN support in CPUID.0x12.0.EAX */
> > +#define SGX_CPUID_EUPDATESVN	BIT(10)
> > +
> >  enum sgx_encls_function {
> > -	ECREATE	= 0x00,
> > -	EADD	= 0x01,
> > -	EINIT	= 0x02,
> > -	EREMOVE	= 0x03,
> > -	EDGBRD	= 0x04,
> > -	EDGBWR	= 0x05,
> > -	EEXTEND	= 0x06,
> > -	ELDU	= 0x08,
> > -	EBLOCK	= 0x09,
> > -	EPA	= 0x0A,
> > -	EWB	= 0x0B,
> > -	ETRACK	= 0x0C,
> > -	EAUG	= 0x0D,
> > -	EMODPR	= 0x0E,
> > -	EMODT	= 0x0F,
> > +	ECREATE		= 0x00,
> > +	EADD		= 0x01,
> > +	EINIT		= 0x02,
> > +	EREMOVE		= 0x03,
> > +	EDGBRD		= 0x04,
> > +	EDGBWR		= 0x05,
> > +	EEXTEND		= 0x06,
> > +	ELDU		= 0x08,
> > +	EBLOCK		= 0x09,
> > +	EPA			= 0x0A,
> > +	EWB			= 0x0B,
> > +	ETRACK		= 0x0C,
> > +	EAUG		= 0x0D,
> > +	EMODPR		= 0x0E,
> > +	EMODT		= 0x0F,
> > +	EUPDATESVN	= 0x18,
> >  };
> >
> >  /**
> > @@ -73,6 +77,11 @@ enum sgx_encls_function {
> >   *				public key does not match
> IA32_SGXLEPUBKEYHASH.
> >   * %SGX_PAGE_NOT_MODIFIABLE:	The EPC page cannot be modified
> because it
> >   *				is in the PENDING or MODIFIED state.
> > + * %SGX_INSUFFICIENT_ENTROPY:	Insufficient entropy in RNG.
> > + * %SGX_EPC_NOT_READY:			EPC is not ready for SVN
> update.
> > + * %SGX_NO_UPDATE:		EUPDATESVN was successful, but
> CPUSVN was not
> > + *				updated because current SVN was not newer
> than
> > + *				CPUSVN.
> >   * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was
> received
> >   */
> >  enum sgx_return_code {
> > @@ -81,6 +90,9 @@ enum sgx_return_code {
> >  	SGX_CHILD_PRESENT		= 13,
> >  	SGX_INVALID_EINITTOKEN		= 16,
> >  	SGX_PAGE_NOT_MODIFIABLE		= 20,
> > +	SGX_INSUFFICIENT_ENTROPY	= 29,
> > +	SGX_EPC_NOT_READY			= 30,
> > +	SGX_NO_UPDATE				= 31,
> >  	SGX_UNMASKED_EVENT		= 128,
> >  };
> 
> I'd *much* rather that these mechanical constant introductions and
> mindless refactoring (like reindenting) not be mixed with actual logic code.

Understood. 

> 
> >
> > diff --git a/arch/x86/kernel/cpu/sgx/driver.c
> b/arch/x86/kernel/cpu/sgx/driver.c
> > index 7f8d1e11dbee..669e44d61f9f 100644
> > --- a/arch/x86/kernel/cpu/sgx/driver.c
> > +++ b/arch/x86/kernel/cpu/sgx/driver.c
> > @@ -19,6 +19,10 @@ static int sgx_open(struct inode *inode, struct file
> *file)
> >  	struct sgx_encl *encl;
> >  	int ret;
> >
> > +	ret = sgx_inc_usage_count();
> > +	if (ret)
> > +		return ret;
> > +
> >  	encl = kzalloc(sizeof(*encl), GFP_KERNEL);
> >  	if (!encl)
> >  		return -ENOMEM;
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index 279148e72459..3b54889ae4a4 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -765,6 +765,7 @@ void sgx_encl_release(struct kref *ref)
> >  	WARN_ON_ONCE(encl->secs.epc_page);
> >
> >  	kfree(encl);
> > +	sgx_dec_usage_count();
> >  }
> >
> >  /*
> > diff --git a/arch/x86/kernel/cpu/sgx/encls.h
> b/arch/x86/kernel/cpu/sgx/encls.h
> > index 99004b02e2ed..788acf8ec563 100644
> > --- a/arch/x86/kernel/cpu/sgx/encls.h
> > +++ b/arch/x86/kernel/cpu/sgx/encls.h
> > @@ -233,4 +233,9 @@ static inline int __eaug(struct sgx_pageinfo *pginfo,
> void *addr)
> >  	return __encls_2(EAUG, pginfo, addr);
> >  }
> >
> > +/* Update CPUSVN at runtime. */
> > +static inline int __eupdatesvn(void)
> > +{
> > +	return __encls_ret_1(EUPDATESVN, "");
> > +}
> >  #endif /* _X86_ENCLS_H */
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c
> b/arch/x86/kernel/cpu/sgx/main.c
> > index 8ce352fc72ac..2872567cd22b 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/sysfs.h>
> >  #include <linux/vmalloc.h>
> >  #include <asm/sgx.h>
> > +#include <asm/archrandom.h>
> >  #include "driver.h"
> >  #include "encl.h"
> >  #include "encls.h"
> > @@ -914,6 +915,73 @@ int sgx_set_attribute(unsigned long
> *allowed_attributes,
> >  }
> >  EXPORT_SYMBOL_GPL(sgx_set_attribute);
> >
> > +static bool sgx_has_eupdatesvn;
> 
> We have CPUID "caches" of sorts. Why open code this?

You mean X86_FEATURE_*? 

SGX CPUID is only defined in SGX code currently (btw, I am not sure
why they are made special) so it doesn’t support this.
Is this a past decision that you want to change? 

> 
> > +static atomic_t sgx_usage_count;
> 
> Is 32 bits enough for sgx_usage_count? What goes boom when it overflows?

I can increase the value easily, but even with current number we are talking
about 2^32 enclaves, i dont think this is realistic to happen in practice. 

> 
> > +static DEFINE_MUTEX(sgx_svn_lock);
> 
> What does this lock protect?

Will add description. 

> 
> > +/**
> > + * sgx_updatesvn() - Issue ENCLS[EUPDATESVN]
> > + * If EPC is empty, this instruction will update CPUSVN to the currently
> > + * loaded microcode update SVN and generate new cryptographic assets.
> 
> This is *NOT* EUPDATESVN. "this instruction" is not what's happening here.
> 
> sgx_updatesvn() _tries_ to update the SVN. Most of the time, there will
> be no update and that's OK. This should only be called with EPC is empty.

Will fix description. 

> 
> > + * Return:
> > + * 0: Success or not supported
> > + * errno on error
> 
> I'm not a big fan of filling thing out just because.
> 
> What value is there in saying "errno on error"?

Will remove. 

> 
> > + */
> > +static int sgx_update_svn(void)
> > +{
> > +	int retry = RDRAND_RETRY_LOOPS;
> > +	int ret;
> > +
> > +	if (!sgx_has_eupdatesvn)
> > +		return 0;
> 
> This looks goofy. Why is it OK to just silently ignore an update
> request? (I know the answer, but it needs to be obvious)

Will add the explicit comment. 

> 
> > +	do {
> > +		ret = __eupdatesvn();
> > +	} while (ret == SGX_INSUFFICIENT_ENTROPY && --retry);
> 
> 
> 	for (i = 0; i < RDRAND_RETRY_LOOPS; i++) {
> 		ret = __eupdatesvn();
> 
> 		/* Stop on success or unexpected errors: */
> 		if (ret != SGX_INSUFFICIENT_ENTROPY)
> 			break;
> 	}

Will change.

> 
> > +	if (!ret || ret == SGX_NO_UPDATE) {
> > +		/*
> > +		 * SVN successfully updated, or it was already up-to-date.
> > +		 * Let users know when the update was successful.
> > +		 */
> > +		if (!ret)
> > +			pr_info("SVN updated successfully\n");
> > +		return 0;
> > +	}
> 
> Isn't this a lot simpler?
> 
> 	if (ret == SGX_NO_UPDATE)
> 		return 0;
> 
> 	if (!ret) {
> 		pr_info("SVN updated successfully\n");
> 		return 0;
> 	}

Sure, will change. 

> 
> > +	/*
> > +	 * EUPDATESVN was called when EPC is empty, all other error
> > +	 * codes are unexcepted except running out of entropy.
> 
> 			^ unexpected
> 
> Spell check, please.

Will fix. 

> 
> > +	 */
> > +	if (ret != SGX_INSUFFICIENT_ENTROPY)
> > +		ENCLS_WARN(ret, "EUPDATESVN");
> > +	return ret;
> > +}
> 
> The indentation here is backwards. The error paths should be indented
> and the success path at the lowest indent whenever possible. This, for
> example:

Will fix. 

> 
> 	if (ret == SGX_NO_UPDATE)
> 		return 0;
> 
> 	if (ret) {
> 		ENCLS_WARN(ret, "EUPDATESVN");
> 		return ret;
> 	}
> 
> 	pr_info("SVN updated successfully\n");
> 	return 0;
> 
> Oh, and do we expect SGX_INSUFFICIENT_ENTROPY all the time? I thought it
> was supposed to be horribly rare. Shouldn't we warn on it?

The entropy comes from RDSEED in this case, not RDRAND.
This is not a horribly rare but very realistic failure. 

> 
> > +int sgx_inc_usage_count(void)
> > +{
> 
> No comments, eh?
> 
> What does success _mean_? What does failure mean?

Will add. 

> 
> > +	int ret;
> > +
> > +	if (atomic_inc_not_zero(&sgx_usage_count))
> > +		return 0;
> > +
> > +	guard(mutex)(&sgx_svn_lock);
> > +
> > +	if (atomic_inc_not_zero(&sgx_usage_count))
> > +		return 0;
> > +
> > +	ret = sgx_update_svn();
> > +	if (!ret)
> > +		atomic_inc(&sgx_usage_count);
> > +	return ret;
> > +}
> 
> Gah, this is 100% *NOT* obvious what's going on. Yet there are zero
> comments on it. The lock is uncommented. The atomic is uncommented.
> 
> What does any of this mean? What do the components do?

Will fix. 

> 
> > +
> > +void sgx_dec_usage_count(void)
> > +{
> > +	atomic_dec(&sgx_usage_count);
> > +}
> 
> 
> 
> >  static int __init sgx_init(void)
> >  {
> >  	int ret;
> > @@ -947,6 +1015,8 @@ static int __init sgx_init(void)
> >  	if (sgx_vepc_init() && ret)
> >  		goto err_provision;
> >
> > +	sgx_has_eupdatesvn = (cpuid_eax(SGX_CPUID) &
> SGX_CPUID_EUPDATESVN);
> > +
> >  	return 0;
> >
> >  err_provision:
> > diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> > index d2dad21259a8..f5940393d9bd 100644
> > --- a/arch/x86/kernel/cpu/sgx/sgx.h
> > +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> > @@ -102,6 +102,9 @@ static inline int __init sgx_vepc_init(void)
> >  }
> >  #endif
> >
> > +int sgx_inc_usage_count(void);
> > +void sgx_dec_usage_count(void);
> > +
> >  void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
> >
> >  #endif /* _X86_SGX_H */
> > diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> > index 7aaa3652e31d..e548de340c2e 100644
> > --- a/arch/x86/kernel/cpu/sgx/virt.c
> > +++ b/arch/x86/kernel/cpu/sgx/virt.c
> > @@ -255,12 +255,18 @@ static int sgx_vepc_release(struct inode *inode,
> struct file *file)
> >  	xa_destroy(&vepc->page_array);
> >  	kfree(vepc);
> >
> > +	sgx_dec_usage_count();
> >  	return 0;
> >  }
> >
> >  static int sgx_vepc_open(struct inode *inode, struct file *file)
> >  {
> >  	struct sgx_vepc *vepc;
> > +	int ret;
> > +
> > +	ret = sgx_inc_usage_count();
> > +	if (ret)
> > +		return ret;
> >
> >  	vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
> >  	if (!vepc)
> 
> I think I'd do this in at least 4 patches:
> 
> 1. Introduce the usage count tracking: the atomic and the open/release
>    "hooks", maybe without error handling on the open() side
> 2. Introduce the EUPDATESVN mechanical bits: the CPUID bit, the
>    enumeration, the bool, the new error enum values
> 3. Introduce the mechanical eupdatesvn function. The retry loop and the
>    "no entropy" handling
> 4. Plumb #3 into #1
> 
> #4 is your place to argue if EUPDATESVN failures should cascade to open().

I will use this breakout for the next version, thank you!

Best Regards,
Elena.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ