Message-ID: <CAGtprH9UTqC-wmOhfjr2qNk2X-BDJokmLYjET=Zm+Zu+QHZ6Dw@mail.gmail.com>
Date: Fri, 31 Oct 2025 10:57:20 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Sagi Shahar <sagis@...gle.com>
Cc: Chao Gao <chao.gao@...el.com>, linux-coco@...ts.linux.dev,
linux-kernel@...r.kernel.org, x86@...nel.org, reinette.chatre@...el.com,
ira.weiny@...el.com, kai.huang@...el.com, dan.j.williams@...el.com,
yilun.xu@...ux.intel.com, paulmck@...nel.org, nik.borisov@...e.com,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support

On Fri, Oct 31, 2025 at 9:55 AM Sagi Shahar <sagis@...gle.com> wrote:
>
> On Tue, Sep 30, 2025 at 9:54 PM Chao Gao <chao.gao@...el.com> wrote:
> >
> > Changelog:
> > v1->v2:
> > - Replace tdx subsystem with a "tdx-host" device implementation
> > - Reorder patches to reduce reviewer's mental "list of things to look out for"
> > - Replace "TD-Preserving update" with "runtime TDX Module Update"
> > - Drop the temporary "td_preserving_ready" flag
> > - Move low-level SEAMCALL helpers to their own header file
> > - Don't create a new, inferior framework to save/restore VMCS
> > - Minor cleanups and changelog improvements for clarity and consistency
> > - Collect review tags
> > - I didn't add Sagi Shahar's Tested-by due to various changes/reorder etc.
> > - v1: https://lore.kernel.org/kvm/20250523095322.88774-1-chao.gao@intel.com/
> >
> > Hi Reviewers,
> >
> > This series adds support for runtime TDX Module updates that preserve
> > running TDX guests.
> >
> > == Background ==
> >
> > Intel TDX isolates Trusted Domains (TDs), or confidential guests, from the
> > host. A key component of Intel TDX is the TDX Module, which enforces
> > security policies to protect the memory and CPU states of TDs from the
> > host. However, the TDX Module is software that requires updates.
> >
> > == Problems ==
> >
> > Currently, the TDX Module is loaded by the BIOS at boot time, and the only
> > way to update it is through a reboot, which results in significant system
> > downtime. Users expect the TDX Module to be updatable at runtime without
> > disrupting TDX guests.
> >
> > == Solution ==
> >
> > On TDX platforms, P-SEAMLDR[1] is a component within the protected SEAM
> > range. It is loaded by the BIOS and provides the host with functions to
> > install a TDX Module at runtime.
> >
> > Implement a TDX Module update facility via the fw_upload mechanism. Given
> > that there is variability in which module update to load based on features,
> > fix levels, and potentially reloading the same version for error recovery
> > scenarios, the explicit, userspace-chosen payload flexibility of fw_upload
> > is attractive.
> >
> > This design allows the kernel to accept a bitstream instead of loading a
> > named file from the filesystem, as the module selection and policy
> > enforcement for TDX Modules are quite complex (see more in patch 8). By
> > doing so, much of this complexity is shifted out of the kernel. The kernel
> > needs to expose information, such as the TDX Module version, to userspace.
> > Userspace must understand the TDX Module versioning scheme and update
> > policy to select the appropriate TDX Module (see "TDX Module Versioning"
> > below).
> >
> > In the unlikely event the update fails, for example userspace picks an
> > incompatible update image, or the image is otherwise corrupted, all TDs
> > will experience SEAMCALL failures and be killed. The recovery of TD
> > operation from that event requires a reboot.
> >
> > Given there is no mechanism to quiesce SEAMCALLs, the TDs themselves must
> > pause execution over an update. The most straightforward way to meet the
> > 'pause TDs while update executes' constraint is to run the update in
> > stop_machine() context. All other evaluated solutions export more
> > complexity to KVM, or export more fragility to userspace.
> >
> > == How to test this series ==
> >
> > This series can be tested using the userspace tool that is able to
> > select the appropriate TDX module and install it via the interfaces
> > exposed by this series:
> >
> > # git clone https://github.com/intel/tdx-module-binaries
> > # cd tdx-module-binaries
> > # python version_select_and_load.py --update
> >
> > == Base commit ==
> >
> > This series is based on:
> > https://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm.git/commit/?h=tdx&id=9332e088937f
>
> Can you clarify which patches are needed from this tree? Is it just
> "coco/tdx-host: Introduce a "tdx_host" device" or does this series also
> depend on other patches?
>
> More specifically, does this series depend on "Move VMXON/VMXOFF
> handling from KVM to CPU lifecycle"?
>
Hi Chao,
Is this non-RFC series dependent on RFC patches?
What's the intended order of upstreaming the features and dependencies
being discussed here?
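
For reference, the userspace side of the fw_upload flow described in the
quoted cover letter can be sketched roughly as below. This is only a minimal
illustration of the generic firmware-upload sysfs protocol (the "loading",
"data", "status" and "error" files), not the tool from the
tdx-module-binaries repository. The directory passed in is an assumption:
the actual sysfs node name that this series registers for the TDX Module is
not stated in this message, and the function name and timeout values are
hypothetical.

```python
import time
from pathlib import Path


def upload_firmware(fw_dir, image, timeout_s=30.0, poll_s=0.1):
    """Push `image` through fw_upload-style sysfs files under `fw_dir`.

    `fw_dir` is assumed to be a /sys/class/firmware/<name>/ directory;
    the <name> used for the TDX Module is not given in this thread.
    """
    fw_dir = Path(fw_dir)

    # Start the transfer, write the payload, then signal completion.
    (fw_dir / "loading").write_text("1")
    (fw_dir / "data").write_bytes(image)
    (fw_dir / "loading").write_text("0")

    # Poll until the upload state machine reports "idle" again.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if (fw_dir / "status").read_text().strip() == "idle":
            break
        time.sleep(poll_s)
    else:
        raise TimeoutError("firmware upload did not return to idle")

    # "error" is only meaningful once status is idle; empty means success.
    error = (fw_dir / "error").read_text().strip()
    if error:
        raise RuntimeError(f"firmware upload failed: {error}")
```

A caller would read the module image into memory and invoke
upload_firmware(<sysfs dir>, image_bytes) as root. Per the cover letter,
picking a compatible image remains a userspace policy decision, so a real
tool would check the running TDX Module version before uploading.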