Message-ID: <aXis76vQhWi3RvEB@intel.com>
Date: Tue, 27 Jan 2026 20:17:51 +0800
From: Chao Gao <chao.gao@...el.com>
To: <dan.j.williams@...el.com>
CC: <linux-coco@...ts.linux.dev>, <linux-kernel@...r.kernel.org>,
	<kvm@...r.kernel.org>, <x86@...nel.org>, <reinette.chatre@...el.com>,
	<ira.weiny@...el.com>, <kai.huang@...el.com>, <yilun.xu@...ux.intel.com>,
	<sagis@...gle.com>, <vannapurve@...gle.com>, <paulmck@...nel.org>,
	<nik.borisov@...e.com>, <zhenzhong.duan@...el.com>, <seanjc@...gle.com>,
	<rick.p.edgecombe@...el.com>, <kas@...nel.org>,
	<dave.hansen@...ux.intel.com>, <vishal.l.verma@...el.com>
Subject: Re: [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module
 update expectations

On Mon, Jan 26, 2026 at 02:14:18PM -0800, dan.j.williams@...el.com wrote:
>Chao Gao wrote:
>> In rare cases, TDX Module updates may cause TD management operations to
>> fail if they occur during phases of the TD lifecycle that are sensitive
>> to update compatibility.
>
>No. The TDX Module wants to be able to claim that some updates are
>compatible when they are not. If Linux takes on additional exclusions it
>modestly increases the scope of changes that can be included in an
>update. It is not possible to claim "rare" if module updates routinely
>include that problematic scope.
>
>> But not all combinations of P-SEAMLDR, kernel, and TDX Module have the
>> capability to detect and prevent said incompatibilities. Completely
>> disabling TDX Module updates on platforms without the capability would
>> be overkill, as these incompatibility cases are rare and can be
>> addressed by userspace through coordinated scheduling of updates and TD
>> management operations.
>
>"Completely disabling" is not the tradeoff. The tradeoff is whether or
>not the TDX Module meets Linux compatible update requirements or not.
>
>> To set clear expectations for TDX Module updates, expose the capability
>> to detect and prevent these incompatibility cases via sysfs and
>> document the compatibility criteria and indications when those criteria
>> are violated.
>
>Linux derives no benefit from a "compat_capable" kernel ABI. Yes, the
>internals must export the error condition on collision. I am not
>debating that nor revisiting the decision of pre-update-fail, vs
>post-collision-notify. However, if the module violates the Linux
>expectations that is the module's issue to document or preclude. The
>fact that the compatibility contract is ambiguous to the kernel is a
>feature. It puts the onus squarely on module updates to be documented
>(or tools updated to understand) as meeting or violating Linux
>compatibility expectations.
>
>> Signed-off-by: Chao Gao <chao.gao@...el.com>
>> ---
>> v3:
>>  - new, based on a reference patch from Dan Williams
>
>One of the details that is missing is the protocol (module documentation
>or tooling) to determine ahead of time if an update is compatible. That
>obviates the need for "compat_capable" ABI which serves no long term
>purpose. Specifically, the expectation is "run non-compatible updates at
>your own operational risk".

Agreed. We need to add metadata, such as the crypto library version or an
equivalent abstraction, to the mapping file. This lets userspace determine
ahead of time whether a module update meets Linux compatibility
requirements. I'll submit a request for this metadata.

In fact, userspace can already determine whether a TDX Module supports
"collision avoidance" by reading the "tdx_features0" field from the mapping
file [1].

[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/mapping_file.json
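
As a rough sketch of that userspace check: the helpers below look up every
"tdx_features0" value in a parsed mapping file and test a single bit. The
helper names, the JSON layout in the usage note, and the bit position are
assumptions for illustration only; the real bit for "collision avoidance"
has to come from the TDX Module documentation.

```python
import json


def find_key(node, key):
    """Yield every value stored under `key` anywhere in a parsed JSON tree."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def has_feature(value, bit):
    """Test one bit of a features word given as an int or a string like "0x12"."""
    word = value if isinstance(value, int) else int(str(value), 0)
    return bool((word >> bit) & 1)


def load_features(path):
    """Return all "tdx_features0" values found in the JSON file at `path`."""
    with open(path, encoding="utf-8") as f:
        return list(find_key(json.load(f), "tdx_features0"))
```

Usage would be along the lines of
`any(has_feature(v, BIT) for v in load_features("mapping_file.json"))`,
where BIT is a placeholder for the documented bit position.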

>
>So, remove "compat_capable" ABI. Amend the "error" ABI documentation
>with the details for avoiding failures and the risk of running updates
>on configurations that support update but not collision avoidance.

Got it. I will modify this patch as follows:

diff --git a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
index a3f155977016..0a68e68375fa 100644
--- a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
+++ b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
@@ -29,3 +29,57 @@ Description:	(RO) Report the number of remaining updates that can be performed.
		4.2 "SEAMLDR.INSTALL" for more information. The documentation is
		available at:
		https://cdrdv2-public.intel.com/739045/intel-tdx-seamldr-interface-specification.pdf
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload
+Contact:	linux-coco@...ts.linux.dev
+Description:	(Directory) The seamldr_upload directory implements the
+		fw_upload sysfs ABI, see
+		Documentation/ABI/testing/sysfs-class-firmware for the general
+		description of the attributes @data, @cancel, @error, @loading,
+		@remaining_size, and @status. This ABI facilitates "Compatible
+		TDX Module Updates". A compatible update is one that meets the
+		following criteria:
+
+		   Does not interrupt or interfere with any current TDX
+		   operation or TD VM.
+
+		   Does not invalidate any previously consumed Module metadata
+		   values outside of the TEE_TCB_SVN_2 field (updated Security
+		   Version Number) in TD Quotes.
+
+		   Does not require validation of new Module metadata fields. By
+		   implication, new Module features and capabilities are only
+		   available by installing the Module at reboot (BIOS or EFI
+		   helper loaded).
+
+		See tdx_host/firmware/seamldr_upload/error for more details.
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload/error
+Contact:	linux-coco@...ts.linux.dev
+Description:	(RO) See Documentation/ABI/testing/sysfs-class-firmware for
+		baseline expectations for this file. The <ERROR> part in the
+		<STATUS>:<ERROR> format can be:
+
+		   "device-busy": Compatibility checks failed or not all CPUs
+		                  are online.
+		   "flash-wearout": The number of updates reached the limit.
+		   "read-write-error": Memory allocation failed.
+		   "hw-error": Cannot communicate with P-SEAMLDR or TDX Module.
+		   "firmware-invalid": The TDX Module to be installed is invalid
+		                       or other unexpected errors occurred.
+
+		"hw-error" or "firmware-invalid" may be fatal, causing all TDs
+		and the TDX Module to be lost and preventing further TDX
+		operations. In that case, /sys/devices/faux/tdx_host/version
+		becomes unreadable after the update fails. For other errors, TDs
+		and the (previous) TDX Module stay running.
+
+		On certain earlier TDX Module versions, incompatible updates may
+		not trigger "device-busy" errors but instead cause TD
+		attestation failures.
+
+		See version_select_and_load.py [1] documentation for how to
+		detect compatible updates and whether the current platform
+		components catch errors or let them leak and cause potential TD
+		attestation failures.
+		[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/version_select_and_load.py
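
For userspace consuming the error attribute documented above, here is a
minimal sketch of splitting the <STATUS>:<ERROR> string and flagging the two
errors that may be fatal. The helper names are hypothetical, and reading the
sysfs file itself is left to the caller:

```python
# Errors that the documentation above says may be fatal.
POSSIBLY_FATAL = {"hw-error", "firmware-invalid"}


def parse_upload_error(text):
    """Split the contents of the error attribute into (status, error).

    Returns None when the attribute is empty, i.e. no failure is recorded.
    """
    text = text.strip()
    if not text:
        return None
    status, _, error = text.partition(":")
    return status, error


def may_be_fatal(error):
    """True if TDs and the TDX Module may have been lost after this error."""
    return error in POSSIBLY_FATAL
```

When may_be_fatal() returns True, the caller would still need to check
whether /sys/devices/faux/tdx_host/version is readable before concluding
that the TDX Module is gone.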
