[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM8PR11MB5750C80ED4AB80C696F581EFE7BD2@DM8PR11MB5750.namprd11.prod.outlook.com>
Date: Wed, 16 Apr 2025 12:06:56 +0000
From: "Reshetova, Elena" <elena.reshetova@...el.com>
To: "Huang, Kai" <kai.huang@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>
CC: "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, "Scarlata,
Vincent R" <vincent.r.scarlata@...el.com>, "x86@...nel.org" <x86@...nel.org>,
"jarkko@...nel.org" <jarkko@...nel.org>, "Annapurve, Vishal"
<vannapurve@...gle.com>, "Cai, Chong" <chongc@...gle.com>, "Mallick, Asit K"
<asit.k.mallick@...el.com>, "Aktas, Erdem" <erdemaktas@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bondarn@...gle.com" <bondarn@...gle.com>, "dionnaglaze@...gle.com"
<dionnaglaze@...gle.com>, "Raynor, Scott" <scott.raynor@...el.com>
Subject: RE: [PATCH v3 2/2] x86/sgx: Implement EUPDATESVN and
opportunistically call it during first EPC page alloc
> On Tue, 2025-04-15 at 14:51 +0300, Elena Reshetova wrote:
> > SGX architecture introduced a new instruction called EUPDATESVN
>
> "a new ENCLS leaf function EUPDATESVN"?
Yes, you are right, better wording, will fix.
>
> > to Ice Lake. It allows updating security SVN version, given that EPC
> > is completely empty. The latter is required for security reasons
> > in order to reason that enclave security posture is as secure as the
> > security SVN version of the TCB that created it.
> >
> > Additionally it is important to ensure that while ENCLS[EUPDATESVN]
> > runs, no concurrent page creation happens in EPC, because it might
> > result in #GP delivered to the creator. Legacy SW might not be prepared
> > to handle such unexpected #GPs and therefore this patch introduces
> > a locking mechanism to ensure no concurrent EPC allocations can happen.
> >
> > It is also ensured that ENCLS[EUPDATESVN] is not called when running
> > in a VM since it does not have a meaning in this context (microcode
> > updates application is limited to the host OS) and will create
> > unnecessary load.
> >
> > This patch is based on previous submision by Cathy Zhang
> > https://lore.kernel.org/all/20220520103904.1216-1-cathy.zhang@intel.com/
>
> When Jarkko suggested to use "Link" here:
>
> https://lore.kernel.org/lkml/Z983ZaTaWNqFUpYS@kernel.org/
>
> I think he meant something like below:
>
> This patch is based on ... [1]
>
> Link: https://lore.kernel.org/all/20220520103904.1216-1-
> cathy.zhang@...el.com/
> [1]
Oh, I see, I didn’t understand this. Can fix in the next version.
>
> >
> > Signed-off-by: Elena Reshetova <elena.reshetova@...el.com>
> > ---
> > arch/x86/include/asm/sgx.h | 41 +++++++++++------
> > arch/x86/kernel/cpu/sgx/encls.h | 6 +++
> > arch/x86/kernel/cpu/sgx/main.c | 82
> ++++++++++++++++++++++++++++++++-
> > arch/x86/kernel/cpu/sgx/sgx.h | 1 +
> > 4 files changed, 114 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> > index 6a0069761508..5caf5c31ebc6 100644
> > --- a/arch/x86/include/asm/sgx.h
> > +++ b/arch/x86/include/asm/sgx.h
> > @@ -26,23 +26,26 @@
> > #define SGX_CPUID_EPC_SECTION 0x1
> > /* The bitmask for the EPC section type. */
> > #define SGX_CPUID_EPC_MASK GENMASK(3, 0)
> > +/* EUPDATESVN presence indication */
> > +#define SGX_CPUID_EUPDATESVN BIT(10)
>
> I am not sure whether this should be a new X86_FEATURE bit, just like other
> SGX
> bits:
>
> X86_FEATURE_SGX1
> X86_FEATURE_SGX2
> X86_FEATURE_SGX_EDECCSSA
>
> .. reported in CPUID.0x12.0x0:EAX.
>
> But this is never going to be exposed to VMs, i.e., KVM should never need to
> use
> it, so perhaps fine to just put it here.
I am fine either way. I can change this if people consider it is a better fit.
>
> [...]
>
>
> >
> > +/* This lock is held to prevent new EPC pages from being created
> > + * during the execution of ENCLS[EUPDATESVN].
> > + */
>
> /*
> * This lock ...
> * ...
> */
Thank you for catching this! I thought that I fixed all comments to this
Format but obvious I missed this one. Sad that checkpatch doesn’t
complain on this (
>
> > +static DEFINE_SPINLOCK(sgx_epc_eupdatesvn_lock);
> > +
> > static atomic_long_t sgx_nr_used_pages = ATOMIC_LONG_INIT(0);
> > static unsigned long sgx_nr_total_pages;
> >
> > @@ -444,6 +449,7 @@ static struct sgx_epc_page
> *__sgx_alloc_epc_page_from_node(int nid)
> > {
> > struct sgx_numa_node *node = &sgx_numa_nodes[nid];
> > struct sgx_epc_page *page = NULL;
> > + bool ret;
> >
> > spin_lock(&node->lock);
> >
> > @@ -452,12 +458,33 @@ static struct sgx_epc_page
> *__sgx_alloc_epc_page_from_node(int nid)
> > return NULL;
> > }
> >
> > + if (!atomic_long_inc_not_zero(&sgx_nr_used_pages)) {
> > + spin_lock(&sgx_epc_eupdatesvn_lock);
> > + /*
> > + * Only call sgx_updatesvn() once the first enclave's
> > + * page is allocated from EPC
> > + */
>
> The VMs can pre-populate EPC w/o having any enclave being created inside.
> How
> about just:
>
> /*
> * Make sure SVN is up-to-date before any EPC page is
> * allocated for any enclave.
> */
>
Right, agree with this wording change, good point!
> > + if (atomic_long_read(&sgx_nr_used_pages) == 0) {
> > + ret = sgx_updatesvn();
> > + if (!ret) {
> > + /*
> > + * sgx_updatesvn() returned unknown error,
> smth
> > + * must be broken, do not allocate a page
> from EPC
> > + */
>
> Strictly speaking, sgx_updatesvn() returns true or false, but not "unknown
> error".
Yes, will fix the wording.
>
> Reading below, it could also fail due to running out of entropy, which is still
> legally possible to happen IMHO.
Actually, no in this case, we only return false from sgx_updatesvn in case unknown
error happens as agreed previously. In case we run out of entropy it should be safe
to retry later and we dont prevent per current code EPC page allocation.
>
> Maybe just:
> /*
> * Updating SVN failed. SGX might be broken,
> * or running out of entropy happened. Do
> not
> * allocate EPC page since it is not safe to
> use
> * SGX anymore if it was the former. If it was
> * due to running out of entropy, the further
> * call of EPC allocation will try
> * sgx_updatesvn() again.
> */
I agree with this except that the current code doesn’t prevent EPC allocation on any
other error return than unknown error. The question is whenever we want to
change the behaviour to require it?
>
> > + spin_unlock(&sgx_epc_eupdatesvn_lock);
> > + spin_unlock(&node->lock);
> > + return NULL;
> > + }
> > + }
> > + atomic_long_inc(&sgx_nr_used_pages);
> > + spin_unlock(&sgx_epc_eupdatesvn_lock);
> > + }
> > +
> > page = list_first_entry(&node->free_page_list, struct sgx_epc_page,
> list);
> > list_del_init(&page->list);
> > page->flags = 0;
> >
> > spin_unlock(&node->lock);
> > - atomic_long_inc(&sgx_nr_used_pages);
> >
> > return page;
> > }
> > @@ -970,3 +997,56 @@ static int __init sgx_init(void)
> > }
> >
> > device_initcall(sgx_init);
> > +
> > +/**
> > + * sgx_updatesvn() - Issue ENCLS[EUPDATESVN]
> > + * If EPC is ready, this instruction will update CPUSVN to the currently
>
> Not sure how to interpret "EPC is ready". Assume it means "EPC is empty", in
> which case should we just say so?
Yes, correct, it means EPC is empty, will fix.
>
> It is consistent with what said in the changelog anyway.
>
> > + * loaded microcode update SVN and generate new cryptographic assets.
> > + *
> > + * Return:
> > + * True: success or EUPDATESVN can be safely retried next time
> > + * False: Unexpected error occurred
>
> Hmm.. IIUC it could fail with running out of entropy but this is still legally
> possible to happen. And it is safe to retry.
Yes, in this case we get back SGX_INSUFFICIENT_ENTROPY currently we
return "true" here and do not prevent EPC allocations of the page in this
case, which means we will start populate EPC and can next time retry only
when EPC is empty again.
>
> > + */
> > +bool sgx_updatesvn(void)
> > +{
> > + int retry = 10;
>
> "10" is a bit out of blue.
>
> We can use RDRAND_RETRY_LOOPS in <asm/archrandom.h> instead:
>
> #define RDRAND_RETRY_LOOPS 10
I can change it to this value.
>
> > + int ret;
> > +
> > + lockdep_assert_held(&sgx_epc_eupdatesvn_lock);
> > +
> > + if (!(cpuid_eax(SGX_CPUID) & SGX_CPUID_EUPDATESVN))
> > + return true;
> > +
> > + /*
> > + * Do not execute ENCLS[EUPDATESVN] if running in a VM since
> > + * microcode updates are only meaningful to be applied on the host.
> > + */
> > + if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> > + return true;
> > +
> > + do {
> > + ret = __eupdatesvn();
> > + if (ret != SGX_INSUFFICIENT_ENTROPY)
> > + break;
> > +
>
> The blank line here is not needed.
Will fix
>
> > + } while (--retry);
> > +
> > + switch (ret) {
> > + case 0:
> > + pr_info("EUPDATESVN: success\n");
> > + break;
> > + case SGX_EPC_NOT_READY:
> > + case SGX_INSUFFICIENT_ENTROPY:
> > + case SGX_EPC_PAGE_CONFLICT:
> > + ENCLS_WARN(ret, "EUPDATESVN");
> > + break;
>
> I don't think we should use ENCLS_WARN() for SGX_INSUFFICIENT_ENTROPY,
> since
> IIUC it is still legally possible to happen after the above retry.
>
> Also, it doesn't seem SGX_EPC_NOT_READY and SGX_EPC_PAGE_CONFLICT
> are needed
> since IIUC the only possible error is out of entropy.
Well, in case we have a kernel bug, and we think EPC is empty and it is safe
to execute EUPDATESVN, while it is not the case, we can still get back the
SGX_EPC_NOT_READY and SGX_EPC_PAGE_CONFLICT from HW, right?
Which means we probably should warn about such buggy cases.
And maybe we should also prevent page allocation from EPC in this case also
similarly as for unknown error?
>
> > + case SGX_NO_UPDATE:
> > + break;
> > + default:
> > + ENCLS_WARN(ret, "EUPDATESVN");
> > + return false;
> > + }
> > +
> > + return true;
> > +
>
> Please remove this blank line.
Sure, thanks for catching!
>
> > +}
> > diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> > index d2dad21259a8..d7d631c973d0 100644
> > --- a/arch/x86/kernel/cpu/sgx/sgx.h
> > +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> > @@ -103,5 +103,6 @@ static inline int __init sgx_vepc_init(void)
> > #endif
> >
> > void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
> > +bool sgx_updatesvn(void);
>
> Seems sgx_updatesvn() is only called by __sgx_alloc_epc_page_from_node().
> Can
> it be made static?
Yes, will do.
Powered by blists - more mailing lists