Message-ID: <68fc2af6305be_10e210029@dwillia2-mobl4.notmuch>
Date: Fri, 24 Oct 2025 18:42:14 -0700
From: <dan.j.williams@...el.com>
To: Vishal Annapurve <vannapurve@...gle.com>, Dave Hansen
	<dave.hansen@...el.com>
CC: <dan.j.williams@...el.com>, Chao Gao <chao.gao@...el.com>, "Reshetova,
 Elena" <elena.reshetova@...el.com>, "linux-coco@...ts.linux.dev"
	<linux-coco@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>, "Chatre,
 Reinette" <reinette.chatre@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>,
	"Huang, Kai" <kai.huang@...el.com>, "yilun.xu@...ux.intel.com"
	<yilun.xu@...ux.intel.com>, "sagis@...gle.com" <sagis@...gle.com>,
	"paulmck@...nel.org" <paulmck@...nel.org>, "nik.borisov@...e.com"
	<nik.borisov@...e.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
	<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar
	<mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>, Paolo Bonzini
	<pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support

Vishal Annapurve wrote:
> On Fri, Oct 24, 2025 at 2:19 PM Dave Hansen <dave.hansen@...el.com> wrote:
> >
> > On 10/24/25 14:12, dan.j.williams@...el.com wrote:
> > >> The SGX solution, btw, was to at least ensure forward progress (CPUSVN
> > >> update) when the last enclave goes away. So new enclaves aren't
> > >> *prevented* from starting but the window when the first one starts
> > >> (enclave count going from 0->1) is leveraged to do the update.
> > > The status quo does ensure forward progress. The TD does get built and
> > > the update does complete, just the small matter of TD attestation
> > > failures, right?
> 
> I would think that it's not a "small" problem if confidential
> workloads on the hosts are not able to pass attestation.

"Small" as in "not the kernel's problem". Userspace asked for the
update, the update is documented to sometimes clobber an in-progress
build, and userspace ran the update anyway. Userspace asked for the
clobber.

It would be lovely if this clobbering did not happen at all and the
update mechanism did not come with this misfeature. As it stands, the
kernel has no interface to solve that problem. The best it can do is
document that this new update facility has this side effect.

Userspace always has the choice to not update, to coordinate the update
with build, or to do nothing and let tenants try to launch again.
Userspace could even retry the build and hide the tenant failure if it
knew about the clobber, but be clear that the problem is the clobber,
not the kernel doing what userspace asked.
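
To make the "retry the build and hide the tenant failure" option
concrete, here is a rough userspace sketch. Everything in it is
hypothetical: build_td(), attest() and destroy_td() stand in for
whatever the VMM or orchestrator really uses, and it assumes that an
attestation failure is the only signal userspace gets that a build was
clobbered.

```python
def build_with_retry(build_td, attest, destroy_td, max_attempts=3):
    """Rebuild a TD until attestation passes or attempts run out.

    build_td, attest and destroy_td are hypothetical callables supplied
    by the caller; none of them is a real kernel or VMM interface.
    Returns (td, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        td = build_td()
        if attest(td):
            return td, attempt
        # Assumption: a failed attestation means the build raced with a
        # module update, so tear the TD down and start over.
        destroy_td(td)
    raise RuntimeError("TD failed attestation after %d attempts" % max_attempts)
```

The retry only hides the failure from the tenant. It does nothing to
stop the clobber from happening in the first place.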

The clobber, as I understand it, is also limited to cases where the
update includes crypto library changes. I am not sure how often that
happens in practice. Suffice to say, the fact that the clobber is
conditioned on the contents of the update also puts it further away
from being a kernel problem. The clobber does not corrupt kernel state.

> > Oh, yeah, for sure.
> >
> > If we do _nothing_ in the kernel (no build vs. module update
> > synchronization), then the downside is being exposed to attestation
> > failures if userspace either also does nothing or has bugs.
> >
> > That's actually, by far, my preferred solution to this whole mess:
> > Userspace plays stupid games, userspace wins stupid prizes.
> >
> 
> IIUC, enforcing "Avoid updates during update sensitive times" is not
> that complex and will ensure to avoid any issues with user space
> logic.

Userspace logic avoids issues by honoring the documentation that these
ABI sequences need synchronization. Otherwise, the kernel blocking
updates during build just trades one error for another.

Treat this like any other userspace solution for requiring "atomic"
semantics when the kernel mechanisms are not themselves designed to be
atomic: wrap it in userspace synchronization.
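
As a rough illustration of what such a wrapper could look like, here is
a minimal sketch using an advisory flock(2) on a lock file. TD builds
take the lock shared so they can run concurrently with each other, and
the module update takes it exclusive so it waits for in-flight builds
and holds off new ones. The lock path and the build/update callables
are assumptions made up for this example, not an existing interface,
and the scheme only works if every builder and updater on the host
agrees to use it.

```python
import fcntl
import os
from contextlib import contextmanager

LOCK_PATH = "/run/lock/tdx-update.lock"  # hypothetical path


@contextmanager
def tdx_lock(path=LOCK_PATH, exclusive=False):
    """Hold an advisory flock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until the lock can be granted in the requested mode.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        os.close(fd)  # closing the descriptor drops the lock


def run_td_build(build, lock_path=LOCK_PATH):
    # Shared: many builds may proceed at once, but not during an update.
    with tdx_lock(lock_path, exclusive=False):
        return build()


def run_module_update(update, lock_path=LOCK_PATH):
    # Exclusive: waits for in-flight builds, blocks new ones until done.
    with tdx_lock(lock_path, exclusive=True):
        return update()
```

The kernel stays out of the policy decision here: userspace that wants
update and build to be mutually exclusive gets that from its own lock.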
