Message-ID: <20250523095322.88774-1-chao.gao@intel.com>
Date: Fri, 23 May 2025 02:52:23 -0700
From: Chao Gao <chao.gao@...el.com>
To: linux-coco@...ts.linux.dev,
	x86@...nel.org,
	kvm@...r.kernel.org
Cc: seanjc@...gle.com,
	pbonzini@...hat.com,
	eddie.dong@...el.com,
	kirill.shutemov@...el.com,
	dave.hansen@...el.com,
	dan.j.williams@...el.com,
	kai.huang@...el.com,
	isaku.yamahata@...el.com,
	elena.reshetova@...el.com,
	rick.p.edgecombe@...el.com,
	Chao Gao <chao.gao@...el.com>,
	Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Ingo Molnar <mingo@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>
Subject: [RFC PATCH 00/20] TD-Preserving updates

Hi Reviewers,

This series adds support for runtime TDX module updates that preserve
running TDX guests (a.k.a. TD-Preserving updates). The goal is to gather
feedback on the feature design. Please pay attention to the following items:

1. TD-Preserving updates are done in stop_machine() context. The update
   code copies part of multi_cpu_stop() to guarantee lockstep progress on
   all CPUs, but there are a few differences between the two. I am
   wondering whether these differences have reached a point where
   abstracting a common function might do more harm than good. See more
   details in patch 10.

2. P-SEAMLDR SEAMCALLs (specifically SEAMRET from P-SEAMLDR) clear the
   current VMCS pointer, which may disrupt KVM. To prevent VMX instructions
   in IRQ context from encountering a NULL current-VMCS pointer, P-SEAMLDR
   SEAMCALLs are made with IRQs disabled. I'm uncertain whether NMIs could
   cause a problem, but I believe they won't. See more information in
   patch 3.

3. Two helpers, cpu_vmcs_load() and cpu_vmcs_store(), are added in patch 3
   to save and restore the current VMCS. KVM has a variant of cpu_vmcs_load(),
   i.e., vmcs_load(). Extracting KVM's version would cause a lot of code
   churn, and I don't think that can be justified just to remove ~16 LoC
   of duplication. Please let me know if you disagree.
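
As a rough illustration of items 2 and 3, the following is a toy Python
model, not kernel code. ToyCpu, preserved_vmcs() and toy_seamldr_call() are
hypothetical names invented here; the only things taken from the series are
the roles of cpu_vmcs_store() (save) and cpu_vmcs_load() (restore) and the
fact that the call runs with IRQs disabled.

```python
from contextlib import contextmanager


class ToyCpu:
    """Toy stand-in for one logical CPU; not a real hardware interface."""

    def __init__(self, current_vmcs):
        self.current_vmcs = current_vmcs  # None models "no current VMCS"
        self.irqs_enabled = True


@contextmanager
def preserved_vmcs(cpu):
    """Disable IRQs, remember the current VMCS, and restore both on exit."""
    irqs_were_enabled = cpu.irqs_enabled
    cpu.irqs_enabled = False      # no IRQ handler can run VMX instructions now
    saved = cpu.current_vmcs      # plays the role of cpu_vmcs_store()
    try:
        yield
    finally:
        cpu.current_vmcs = saved  # plays the role of cpu_vmcs_load()
        cpu.irqs_enabled = irqs_were_enabled


def toy_seamldr_call(cpu):
    """Hypothetical P-SEAMLDR call: returning from it clears the VMCS."""
    cpu.current_vmcs = None
    return 0


def call_seamldr(cpu):
    """Wrap the toy call so the caller's VMCS survives it."""
    with preserved_vmcs(cpu):
        return toy_seamldr_call(cpu)
```

After call_seamldr() returns, the toy CPU has the same current VMCS and IRQ
state as before the call, which is the property KVM relies on.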

== Background ==

Intel TDX isolates Trusted Domains (TDs), or confidential guests, from the
host. A key component of Intel TDX is the TDX module, which enforces
security policies to protect the memory and CPU states of TDs from the
host. However, the TDX module is software that requires updates; it is not
device firmware in the typical sense.

== Problems ==

Currently, the TDX module is loaded by the BIOS at boot time, and the only
way to update it is through a reboot, which results in significant system
downtime. Users expect the TDX module to be updatable at runtime without
disrupting TDX guests.

== Solution ==

On TDX platforms, P-SEAMLDR[1] is a component within the protected SEAM
range. It is loaded by the BIOS and provides the host with functions to
install a TDX module at runtime.

Implement a TDX Module update facility via the fw_upload mechanism. Given
that there is variability in which module update to load based on features,
fix levels, and potentially reloading the same version for error recovery
scenarios, the explicit userspace chosen payload flexibility of fw_upload
is attractive.

This design allows the kernel to accept a bitstream instead of loading a
named file from the filesystem, as the module selection and policy
enforcement for TDX modules are quite complex (see more in patch 8). By
doing so, much of this complexity is shifted out of the kernel. The kernel
needs to expose information, such as the TDX module version, to userspace.
The userspace tool must understand the TDX module versioning scheme and
update policy to select the appropriate TDX module (see "TDX Module
Versioning" below).
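
For readers unfamiliar with fw_upload, the userspace side can be sketched
as below. This is a minimal sketch, assuming the generic firmware-upload
sysfs files (loading, data, status, error) described in
Documentation/ABI/testing/sysfs-class-firmware. The directory to pass in is
an assumption: this cover letter does not state the name under which the
device registers, so upload_firmware() takes the path as a parameter.

```python
import time
from pathlib import Path


def upload_firmware(fw_dir, image, timeout_s=60.0, poll_s=0.1):
    """Push `image` (bytes) through a firmware-upload sysfs directory.

    fw_dir is the per-device directory (assumed to live under
    /sys/class/firmware/). Raises RuntimeError if the kernel reports an
    error and TimeoutError if the status never returns to "idle".
    """
    fw_dir = Path(fw_dir)

    (fw_dir / "loading").write_text("1")  # start a new upload
    (fw_dir / "data").write_bytes(image)  # hand over the payload
    (fw_dir / "loading").write_text("0")  # payload complete, begin update

    # The update runs asynchronously; wait for status to go back to idle.
    deadline = time.monotonic() + timeout_s
    while (fw_dir / "status").read_text().strip() != "idle":
        if time.monotonic() > deadline:
            raise TimeoutError("firmware upload did not finish")
        time.sleep(poll_s)

    # An empty error file means the last upload succeeded.
    error = (fw_dir / "error").read_text().strip()
    if error:
        raise RuntimeError(f"firmware upload failed: {error}")
```

Selecting which image to pass in stays entirely in userspace, which is the
point of accepting a bitstream rather than a named file.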

In the unlikely event that the update fails, for example because userspace
picks an incompatible update image or the image is otherwise corrupted,
all TDs will experience SEAMCALL failures and be killed. Recovering TD
operation from that event requires a reboot.

Given there is no mechanism to quiesce SEAMCALLs, the TDs themselves must
pause execution during an update. The most straightforward way to meet the
'pause TDs while update executes' constraint is to run the update in
stop_machine() context. All other evaluated solutions either export more
complexity to KVM or export more fragility to userspace.
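
The lockstep property that the update relies on can be illustrated with a
small userspace model. This is only a sketch: it uses Python threads and
threading.Barrier as the rendezvous, whereas multi_cpu_stop() uses a shared
state variable and an acknowledgement counter. run_lockstep() and the toy
steps are hypothetical names, not functions from this series.

```python
import threading


def run_lockstep(num_cpus, steps):
    """Run each callable in `steps` on every toy CPU, one step at a time.

    No CPU starts step N+1 until all CPUs have finished step N.
    Returns the (step_index, cpu) pairs in completion order.
    """
    barrier = threading.Barrier(num_cpus)
    log = []
    log_lock = threading.Lock()

    def cpu_thread(cpu):
        for index, step in enumerate(steps):
            step(cpu)
            with log_lock:
                log.append((index, cpu))
            barrier.wait()  # wait until every CPU has finished this step

    threads = [
        threading.Thread(target=cpu_thread, args=(cpu,))
        for cpu in range(num_cpus)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

In the returned log, every entry for one step precedes every entry for the
next step, regardless of how the threads are scheduled within a step.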

== How to test this series ==

 # git clone https://github.com/intel/tdx-module-binaries
 # cd tdx-module-binaries
 # python version_select_and_load.py --update


This series is based on Sean's kvm-x86/next branch:

  https://github.com/kvm-x86/linux.git next


== Other information relevant to TD-Preserving updates == 

=== TDX module versioning ===

Each TDX module is assigned a version number x.y.z, where x represents the
"major" version, y the "minor" version, and z the "update" version.

TD-Preserving updates are restricted to Z-stream releases, i.e., releases
that change only the update version z.

Note that Z-stream releases do not necessarily guarantee compatibility. A
new release may not be compatible with all previous versions. To address this,
Intel provides a separate file containing compatibility information, which
specifies the minimum module version required for a particular update. This
information is referenced by the tool to determine if two modules are
compatible.
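
The check described above might look roughly like the following. This is a
sketch under stated assumptions, not the logic of the actual tool: it reads
"Z-stream" as "same x.y", and it takes the minimum required version as a
plain argument because the layout of the compatibility file is not
described here. The function names and the version numbers in the usage
note are made up for illustration.

```python
def parse_version(text):
    """Turn "x.y.z" into a tuple of three ints."""
    major, minor, update = (int(part) for part in text.split("."))
    return major, minor, update


def can_td_preserving_update(current, target, min_required):
    """Return True if `target` may replace the running `current` module.

    min_required is the lowest running version that `target` accepts; in
    practice it would come from the compatibility file (assumption).
    """
    cur = parse_version(current)
    tgt = parse_version(target)
    floor = parse_version(min_required)

    same_stream = cur[:2] == tgt[:2]  # only the update version z may differ
    return same_stream and cur >= floor
```

For example, can_td_preserving_update("1.5.3", "1.5.6", "1.5.1") is True,
while a running "1.5.0" module is rejected for the same target. The sketch
deliberately does not require target > current, since reloading the same
version for error recovery is one of the scenarios mentioned earlier.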

=== TCB Stability ===

Updates change the TCB as viewed by attestation reports. In TDX there is a
distinction between the launch-time version and the current version;
TD-Preserving updates cause the latter to change, subject to Z-stream
constraints. The need for runtime updates and the implications of that
version change for attestation were previously discussed in [3].

=== TDX Module Distribution Model ===

At a high level, Intel publishes all TDX modules on GitHub [2], along with
a mapping_file.json that documents the compatibility information for each
TDX module and a script to install the TDX module. OS vendors can package
these modules and distribute them. Administrators install the package and
use the script to select the appropriate TDX module and install it via the
interfaces exposed by this series.

[1]: https://cdrdv2.intel.com/v1/dl/getContent/733584
[2]: https://github.com/intel/tdx-module-binaries
[3]: https://lore.kernel.org/all/5d1da767-491b-4077-b472-2cc3d73246d6@amazon.com/


Chao Gao (20):
  x86/virt/tdx: Print SEAMCALL leaf numbers in decimal
  x86/virt/tdx: Prepare to support P-SEAMLDR SEAMCALLs
  x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs
  x86/virt/tdx: Introduce a "tdx" subsystem and "tsm" device
  x86/virt/tdx: Export tdx module attributes via sysfs
  x86/virt/seamldr: Add a helper to read P-SEAMLDR information
  x86/virt/tdx: Expose SEAMLDR information via sysfs
  x86/virt/seamldr: Implement FW_UPLOAD sysfs ABI for TD-Preserving
    Updates
  x86/virt/seamldr: Allocate and populate a module update request
  x86/virt/seamldr: Introduce skeleton for TD-Preserving updates
  x86/virt/seamldr: Abort updates if errors occurred midway
  x86/virt/seamldr: Shut down the current TDX module
  x86/virt/tdx: Reset software states after TDX module shutdown
  x86/virt/seamldr: Install a new TDX module
  x86/virt/seamldr: Handle TD-Preserving update failures
  x86/virt/seamldr: Do TDX cpu init after updates
  x86/virt/tdx: Establish contexts for the new module
  x86/virt/tdx: Update tdx_sysinfo and check features post-update
  x86/virt/seamldr: Verify availability of slots for TD-Preserving
    updates
  x86/virt/seamldr: Enable TD-Preserving Updates

 Documentation/ABI/testing/sysfs-devices-tdx |  32 ++
 MAINTAINERS                                 |   1 +
 arch/x86/Kconfig                            |  12 +
 arch/x86/include/asm/tdx.h                  |  20 +-
 arch/x86/include/asm/tdx_global_metadata.h  |  12 +
 arch/x86/virt/vmx/tdx/Makefile              |   1 +
 arch/x86/virt/vmx/tdx/seamldr.c             | 443 ++++++++++++++++++++
 arch/x86/virt/vmx/tdx/seamldr.h             |  16 +
 arch/x86/virt/vmx/tdx/tdx.c                 | 248 ++++++++++-
 arch/x86/virt/vmx/tdx/tdx.h                 |  12 +
 arch/x86/virt/vmx/tdx/tdx_global_metadata.c |  29 ++
 arch/x86/virt/vmx/vmx.h                     |  40 ++
 12 files changed, 862 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-tdx
 create mode 100644 arch/x86/virt/vmx/tdx/seamldr.c
 create mode 100644 arch/x86/virt/vmx/tdx/seamldr.h
 create mode 100644 arch/x86/virt/vmx/vmx.h

-- 
2.47.1

