lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAutUaQvgEliXPUs@google.com>
Date: Fri, 25 Apr 2025 10:40:39 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Elena Reshetova <elena.reshetova@...el.com>
Cc: "jarkko@...nel.org" <jarkko@...nel.org>, Kai Huang <kai.huang@...el.com>, 
	Dave Hansen <dave.hansen@...el.com>, 
	"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>, Vincent Scarlata <vincent.r.scarlata@...el.com>, 
	"x86@...nel.org" <x86@...nel.org>, Vishal Annapurve <vannapurve@...gle.com>, Chong Cai <chongc@...gle.com>, 
	Asit K Mallick <asit.k.mallick@...el.com>, Erdem Aktas <erdemaktas@...gle.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "bondarn@...gle.com" <bondarn@...gle.com>, 
	"dionnaglaze@...gle.com" <dionnaglaze@...gle.com>, Scott Raynor <scott.raynor@...el.com>
Subject: Re: [PATCH v3 2/2] x86/sgx: Implement EUPDATESVN and
 opportunistically call it during first EPC page alloc

On Fri, Apr 25, 2025, Elena Reshetova wrote:
> > On Thu, Apr 24, 2025, Elena Reshetova wrote:
> > Userspace generally won't care about a 10us delay when destroying a
> > process, but a 10us delay to launch an enclave could be quite problematic,
> > e.g. in the TDX use case where enclaves may be launched on-demand in
> > response to a guest attestation request.
> 
> Ok, but in this case, you are hooking both: create and release. 
> I guess your line of thinking is that in most of the cases it will be handled by
> a release flow when destroying enclaves, but in cases we happen to need
> to update a machine which doesn’t have a single enclave yet, the create flow
> will be used. The problem is that if you look at the instruction definition, 
> we won't save too much when executing this in release flow (Kai I think already pointed
> this out):
> 
> IF (Other instruction is accessing EPC) THEN
>  RFLAGS.ZF := 1
>  RAX := SGX_LOCKFAIL;
>  GOTO ERROR_EXIT;
> FI
> (* Verify EPC is ready *)
> IF (the EPC contains any valid pages) THEN
>  RFLAGS.ZF := 1;
>  RAX := SGX_EPC_NOT_READY;
>  GOTO ERROR_EXIT;
> FI
> (* Refresh paging key *)  
> IF (NOT RDSEED(&TMP_KEY, 16)) THEN
>  RFLAGS.ZF := 1;
>  RAX := SGX_INSUFFICIENT_ENTROPY;
> GOTO ERROR_EXIT;
> FI
> (* Commit *)
> CR_BASE_KEY := TMP_KEY;
> TMP_CPUSVN := CR_CPUSVN;
> (* Update CPUSVN to current minimum patch even if locked *)
> (* Determine if info status is needed *)
> 
> ----------------------------
> All above would be done anyhow on create even if we successfully
> executed it on release previously (( And then finally we go into:

Ah, so the slow part happens no matter what.

> IF (TMP_CPUSVN = CR_CPUSVN) THEN
>  RFLAGS.CF := 1;
>  RAX := SGX_NO_UPDATE;
> FI
> ERROR_EXIT:
> 
> > 
> > If the update time is tiny, then I agree that hooking release would probably do
> > more harm than good.
> 
> So, it is not that the time is tiny, it is like we will do it twice,
> unnecessary and potentially quite long in both cases (taking RDSEED step into
> account).
> 
> The reason why the instruction is defined this way is that it was not
> intended originally to be inserted into some existing SGX flows, but was
> envisioned to be standalone cmd for the host orchestrator to execute once it
> thinks system is ready. 

So then why on earth is the kernel implementing automatic updates?  I read back
through most of the cover letters, and IIUC, we went straight from "destroy all
enclaves and force an update" to "blindly try to do EUPDATESVN every time the
number of enclaves goes from 0=>1".  Those are essentially the two most extreme
options.

Even worse, rejecting enclave creation on EUPDATESVN failure represents an ABI
change, i.e. could break existing setups.

Why not simply add an ioctl() or sysfs knob to let userspace trigger EUPDATESVN?
If there are pages in the EPC, return -EBUSY.  That would give userspace full
control of the update policy, and would allow for a super simple implementation
in the kernel.  Userspace should darn well know when it's an appropriate time to
do an SVN update, e.g. after userspace has initiated a ucode update, periodically
if it wants to, after the last TDX VM is destroyed, etc.

Assuming TDX module updates are coming down the pipe, with my KVM maintainer hat
on, that is exactly how I will "request" the module be updated.  Userspace initiates
the update and is therefore responsible for knowing when to do (or not do) an update.
The kernel's job is purely to do the actual update and ensure correctness/safety,
e.g. reject the module update if there are active TDX VMs.

That is also how SEV firmware updates are being implemented.  Userspace is
responsible for stopping any VM types that aren't compatible with conccurent
updates.

I see no reason to do something different for SGX.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ