Message-ID: <CAGtprH9bLpQQ_2UOOShd15hPwMqwW+gwo1TzczLbwGdNkcJHhg@mail.gmail.com>
Date: Thu, 23 Oct 2025 13:31:50 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Chao Gao <chao.gao@...el.com>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>, 
	"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>, 
	"Chatre, Reinette" <reinette.chatre@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>, 
	"Huang, Kai" <kai.huang@...el.com>, "Williams, Dan J" <dan.j.williams@...el.com>, 
	"yilun.xu@...ux.intel.com" <yilun.xu@...ux.intel.com>, "sagis@...gle.com" <sagis@...gle.com>, 
	"paulmck@...nel.org" <paulmck@...nel.org>, "nik.borisov@...e.com" <nik.borisov@...e.com>, 
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, 
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>, 
	Paolo Bonzini <pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>, 
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support

On Wed, Oct 22, 2025 at 8:42 AM Vishal Annapurve <vannapurve@...gle.com> wrote:
>
> On Wed, Oct 22, 2025 at 12:15 AM Chao Gao <chao.gao@...el.com> wrote:
> >
> > On Fri, Oct 17, 2025 at 05:01:55PM -0700, Vishal Annapurve wrote:
> > >On Fri, Oct 17, 2025 at 3:08 AM Reshetova, Elena
> > ><elena.reshetova@...el.com> wrote:
> > >>
> > >>
> > >> > > > ...
> > >> > > > > But the situation can be avoided fully, if TD preserving update is not
> > >> > > > > conducted during the TD build time.
> > >> > > >
> > >> > > > Sure, and the TDX module itself could guarantee this just as well as
> > >> > > > the kernel could. It could decline to allow module updates during TD
> > >> > > > builds, or error out the TD build if it collides with an update.
> > >> > >
> > >> > > The TDX module has functionality to decline entering the SHUTDOWN state
> > >> > > (a prerequisite for TD preserving update) if a TD build or any problematic
> > >> > > operation is in progress. It requires the VMM to opt in to this feature.
> > >> >
> > >> > Is this opt-in enabled as part of this series? If not, what is the
> > >> > mechanism to enable this opt-in?
> > >>
> > >> For information about how it works on the TDX module side,
> > >> please consult the latest ABI spec, definition of the TDH.SYS.SHUTDOWN leaf,
> > >> page 321:
> > >> https://cdrdv2.intel.com/v1/dl/getContent/733579
> > >>
> > >
> > >Thanks Elena. Should the patch [1] from this series be modified to
> > >handle the TDX module shutdown as per:
> >
> > Hi Vishal,
> >
> > I will fix this issue in the next version.
> >
> > The plan is to opt in to post-update compatibility detection in the TDX
> > Module. If incompatibilities are found, the module will return errors to
> > any TD build or migration operations that were initiated prior to the
> > updates. Please refer to the TDH.SYS.UPDATE leaf definition in the ABI
> > spec above for details.
> >
> > I prefer this approach because:
> >
> > a. it guarantees forward progress. In contrast, failing updates would
> >    require admins to retry TDX Module updates, and no progress would be
> >    made unless they can successfully avoid race conditions between TDX
> >    module updates and TD build/migration operations. However, if such race
> >    conditions could be reliably prevented, this issue wouldn't require a
> >    fix in the first place.
>
> TD build operations are much more frequent than TDX module update
> operations. Retrying a TD build operation will need additional KVM and
> userspace VMM changes IIUC (assuming the TD build process needs to be
> restarted from scratch). IMO, it would be simpler to handle TDX
> module update failures by retrying.
>
> Admin logic to update TDX modules can be designed either to retry
> failed TDX module updates or, to be more robust, to add some
> synchronization with VM creation attempts on the host. I.e., I think
> it's fine to punt the problem of ensuring forward progress to
> user-space admin logic on the host.

I discussed this offline with Erdem Aktas. From Google's perspective,
"Avoid updates during update-sensitive times" seems the better option,
as I mentioned above.
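
As a rough illustration of the admin-side retry from the quoted text
above, here is a minimal sketch. Every name in it is hypothetical:
UpdateBlocked, update_with_retry() and the try_update callback are
placeholders for whatever the host tooling would actually use, not
existing kernel or TDX module interfaces.

```python
import time


class UpdateBlocked(Exception):
    """Hypothetical error: the update was declined because a TD build
    (or another update-sensitive operation) was in progress."""


def update_with_retry(try_update, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call try_update() until it succeeds or attempts run out.

    try_update is a caller-supplied callable (an assumption of this
    sketch) that raises UpdateBlocked when the update is declined.
    Returns the number of attempts that were needed.
    """
    for attempt in range(attempts):
        try:
            try_update()
            return attempt + 1
        except UpdateBlocked:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the failure to the admin
            # Exponential backoff gives in-flight TD builds time to finish.
            sleep(base_delay * 2 ** attempt)
```

A more robust variant would replace the blind backoff with explicit
synchronization against VM creation on the host.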

To avoid having to choose which policy to enforce in the kernel, a better
way could be to:
* Allow user space to opt in to "Avoid updates during update-sensitive
  times", AND
* Allow user space to opt in to "Detect incompatibility after update"
  as well, OR
* Keep "Detect incompatibility after update" enabled by default, based
  on the appetite for avoiding silent corruption scenarios.
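
To make the interaction of the two opt-ins concrete, here is a small
model of the outcomes. It is only a sketch of my reading of this thread:
UpdatePolicy, outcome() and the result strings are made up for
illustration and do not correspond to any real ABI, and I am assuming
that avoidance takes precedence when both are enabled.

```python
from enum import Flag


class UpdatePolicy(Flag):
    """Hypothetical user-space opt-ins; not a real kernel or module ABI."""
    AVOID_SENSITIVE = 1  # "Avoid updates during update-sensitive times"
    DETECT_AFTER = 2     # "Detect incompatibility after update"


def outcome(policy, td_build_in_progress, new_module_compatible):
    """Return (update_applied, effect_on_in_flight_td_build)."""
    if not td_build_in_progress:
        return (True, "none")  # nothing in flight, nothing to race with
    if UpdatePolicy.AVOID_SENSITIVE in policy:
        # Update is declined even if the new module would have been
        # compatible (the "false alarm" case); the admin has to retry.
        return (False, "continues")
    if new_module_compatible:
        return (True, "continues")
    if UpdatePolicy.DETECT_AFTER in policy:
        return (True, "fails")  # the build gets an error and must restart
    return (True, "at risk")  # neither opt-in: possible silent corruption
```

With both flags set, outcome() reports the update as declined while a
build is in flight, so the detection path is only reached when the
avoidance opt-in is off.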

>
> >
> > b. it eliminates false alarms that could occur with the "block update"
> >    approach. Under the "block update" approach, updates would be rejected
> >    whenever TD build operations are running, regardless of whether the new
> >    module is actually compatible (e.g., when using the same crypto library as
> >    the current module). In contrast, the post-update detection approach only
> >    fails TD build or migration operations when genuine incompatibilities
> >    exist.
