Message-ID: <CAGtprH8q5U6h3p5iDYtwRiyVG_xF8hDwq6G34hLt-jhe+MRNaA@mail.gmail.com>
Date: Wed, 22 Oct 2025 08:42:13 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Chao Gao <chao.gao@...el.com>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
"Chatre, Reinette" <reinette.chatre@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>,
"Huang, Kai" <kai.huang@...el.com>, "Williams, Dan J" <dan.j.williams@...el.com>,
"yilun.xu@...ux.intel.com" <yilun.xu@...ux.intel.com>, "sagis@...gle.com" <sagis@...gle.com>,
"paulmck@...nel.org" <paulmck@...nel.org>, "nik.borisov@...e.com" <nik.borisov@...e.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support
On Wed, Oct 22, 2025 at 12:15 AM Chao Gao <chao.gao@...el.com> wrote:
>
> On Fri, Oct 17, 2025 at 05:01:55PM -0700, Vishal Annapurve wrote:
> >On Fri, Oct 17, 2025 at 3:08 AM Reshetova, Elena
> ><elena.reshetova@...el.com> wrote:
> >>
> >>
> >> > > > ...
> >> > > > > But the situation can be avoided fully, if TD preserving update is not
> >> > > > > conducted during the TD build time.
> >> > > >
> >> > > > Sure, and the TDX module itself could guarantee this as well as much as
> >> > > > the kernel could. It could decline to allow module updates during TD
> >> > > > builds, or error out the TD build if it collides with an update.
> >> > >
> >> > > TDX module has a functionality to decline going into SHUTDOWN state
> >> > > (pre-requisite for TD preserving update) if TD build or any problematic
> >> > > operation is in progress. It requires VMM to opt-in into this feature.
> >> >
> >> > Is this opt-in enabled as part of this series? If not, what is the
> >> > mechanism to enable this opt-in?
> >>
> >> For the information about how it works on TDX module side,
> >> please consult the latest ABI spec, definition of TDH.SYS.SHUTDOWN leaf,
> >> page 321:
> >> https://cdrdv2.intel.com/v1/dl/getContent/733579
> >>
> >
> >Thanks Elena. Should the patch [1] from this series be modified to
> >handle the TDX module shutdown as per:
>
> Hi Vishal,
>
> I will fix this issue in the next version.
>
> The plan is to opt in post-update compatibility detection in the TDX
> Module. If incompatibilities are found, the module will return errors to
> any TD build or migration operations that were initiated prior to the
> updates. Please refer to the TDH.SYS.UPDATE leaf definition in the ABI
> spec above for details.
>
> I prefer this approach because:
>
> a. it guarantees forward progress. In contrast, failing updates would
> require admins to retry TDX Module updates, and no progress would be
> made unless they can successfully avoid race conditions between TDX
> module updates and TD build/migration operations. However, if such race
> conditions could be reliably prevented, this issue wouldn't require a
> fix in the first place.

TD build operations are much more frequent than TDX module update
operations. Retrying a TD build operation will need additional KVM and
userspace VMM changes IIUC (assuming the TD build process needs to be
restarted from scratch). IMO, it would be simpler to handle TDX
module update failures by retrying.

Admin logic to update TDX modules can be designed to either retry
failed TDX module updates or, to be more robust, add some
synchronization with VM creation attempts on the host. I.e., I think
it's fine to punt the problem of ensuring forward progress to
userspace admin logic on the host.

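As a rough illustration of the kind of host-side admin logic I have in
mind, here is a minimal sketch. Everything in it is hypothetical:
UpdateBusy, update_with_retry() and the try_update callback stand in for
whatever interface the host tooling uses to request an update; none of
these names come from this series or from the TDX module ABI. The
optional lock models synchronization with VM creation, assuming the VM
creation path on the host takes the same lock.

```python
import contextlib
import threading
import time


class UpdateBusy(Exception):
    """Hypothetical error: the update was declined because a TD build
    (or another conflicting operation) was in progress."""


def update_with_retry(try_update, attempts=5, delay_s=1.0,
                      vm_create_lock=None, sleep=time.sleep):
    """Call try_update() until it succeeds or attempts run out.

    try_update     -- hypothetical callable that requests the module
                      update and raises UpdateBusy if it is declined
    vm_create_lock -- optional lock that the host's VM creation path is
                      assumed to hold while building a TD

    Returns the number of attempts used. Re-raises UpdateBusy if the
    last attempt is also declined.
    """
    # Without a lock, fall back to plain retrying.
    guard = vm_create_lock if vm_create_lock is not None \
        else contextlib.nullcontext()
    for attempt in range(1, attempts + 1):
        try:
            with guard:
                try_update()
            return attempt
        except UpdateBusy:
            if attempt == attempts:
                raise
            # Linear backoff before the next attempt.
            sleep(delay_s * attempt)


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_update():
        # Simulate two declined updates followed by a success.
        calls["n"] += 1
        if calls["n"] < 3:
            raise UpdateBusy("TD build in progress")

    used = update_with_retry(flaky_update, delay_s=0.0,
                             vm_create_lock=threading.Lock())
    print(used)
```

With the lock in place the retry path should rarely trigger, since TD
builds and updates no longer overlap; the retry loop then only covers
conflicts that the host tooling does not coordinate.
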
>
> b. it eliminates false alarms that could occur with the "block update"
> approach. Under the "block update" approach, updates would be rejected
> whenever TD build operations are running, regardless of whether the new
> module is actually compatible (e.g., when using the same crypto library as
> the current module). In contrast, the post-update detection approach only
> fails TD build or migration operations when genuine incompatibilities
> exist.