linux-kernel - Re: [PATCH v2 00/21] Runtime TDX Module update support

Message-ID: <aad8ae43-a7bd-42b2-9452-2bdee82bf0d8@intel.com>
Date: Thu, 23 Oct 2025 14:10:46 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Vishal Annapurve <vannapurve@...gle.com>, Chao Gao <chao.gao@...el.com>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>,
 "linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "x86@...nel.org" <x86@...nel.org>,
 "Chatre, Reinette" <reinette.chatre@...el.com>,
 "Weiny, Ira" <ira.weiny@...el.com>, "Huang, Kai" <kai.huang@...el.com>,
 "Williams, Dan J" <dan.j.williams@...el.com>,
 "yilun.xu@...ux.intel.com" <yilun.xu@...ux.intel.com>,
 "sagis@...gle.com" <sagis@...gle.com>,
 "paulmck@...nel.org" <paulmck@...nel.org>,
 "nik.borisov@...e.com" <nik.borisov@...e.com>, Borislav Petkov
 <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
 "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
 "Kirill A. Shutemov" <kas@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>,
 "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
 Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support

On 10/23/25 13:31, Vishal Annapurve wrote:
...
>> Admin logic to update TDX modules can be designed to either retry
>> failed TDX module updates or to be more robust, adds some
>> synchronization with VM creation attempts on the host. i.e. I think
>> it's fine to punt this problem of ensuring the forward progress to
>> user-space admin logic on the host.
> Discussed offline with Erdem Aktas on this. From Google's perspective
> "Avoid updates during update-sensitive times" seems a better option as
> I mentioned above.
> 
> To avoid having to choose which policy to enforce in kernel, a better
> way could be to:
> * Allow user space opt-in for "Avoid updates during update-sensitive times" AND
> * Allow user space opt-in for "Detect incompatibility after update" as well OR
> * Keep "Detect incompatibility after update" enabled by default based
> on the appetite for avoiding silent corruption scenarios.

I'd really prefer to keep this simple. Adding new opt-in ABIs up the
wazoo doesn't seem great.

I think I've heard three requirements in the end:

1. Guarantee module update forward progress
2. Avoid "corrupt" TD builds that result from letting the build/update
   race happen
3. Don't complicate the build process by forcing it to error out
   if a module update clobbers a build

One thing I don't think I've heard anyone be worried about is how timely
the update process is. So how about this: Updates wait for any existing
builds to complete. But, new builds wait for updates. That can be done
with a single rwsem:

struct rw_semaphore update_rwsem;

tdx_td_init()
{
	...
+	/* interruptible: bail out if a signal arrives while waiting */
+	ret = down_read_interruptible(&update_rwsem);
+	if (ret)
+		return ret;
	kvm_tdx->state = TD_STATE_INITIALIZED;

tdx_td_finalize()
{
	...
+	up_read(&update_rwsem);
	kvm_tdx->state = TD_STATE_RUNNABLE;

A module update does:

	/* rwsem has no down_write_interruptible(); killable is the closest */
	ret = down_write_killable(&update_rwsem);
	if (ret)
		return ret;
	do_actual_update();
	up_write(&update_rwsem);

There would be no corruption issues, no erroring out of the build
process, and no punting to userspace to ensure forward progress.

The big downside is that both the build process and update process can
appear to hang for a long time. It'll also be a bit annoying to ensure
that up_read(&update_rwsem) still gets called if the kvm_tdx object gets
torn down during a build.
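
One possible way to handle the teardown case, sketched as a single-threaded
toy model rather than real KVM code: remember per TD whether the read side
is still held, and funnel both finalize and teardown through one helper
that releases it at most once. The build_locked field and all function
names here are hypothetical, and a plain counter stands in for the rwsem.
Real code would also need to protect the flag against concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for update_rwsem: only counts shared holders. */
static int update_readers;

/* Hypothetical slice of per-TD state; build_locked is an assumed field. */
struct td_model {
	bool build_locked;	/* true between init and finalize */
};

static void td_init(struct td_model *td)
{
	update_readers++;		/* stands in for down_read() */
	td->build_locked = true;
}

/* Drop the shared hold at most once, whichever path gets here first. */
static void td_put_build_lock(struct td_model *td)
{
	if (!td->build_locked)
		return;
	td->build_locked = false;
	assert(update_readers > 0);
	update_readers--;		/* stands in for up_read() */
}

/* Normal end of a build. */
static void td_finalize(struct td_model *td)
{
	td_put_build_lock(td);
}

/* Teardown, possibly in the middle of a build. */
static void td_destroy(struct td_model *td)
{
	td_put_build_lock(td);
}
```

With this shape, a TD destroyed mid-build and a TD destroyed after
finalize both leave the reader count balanced.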

But the massive upside is that there's no new ABI and all the
consistency and forward progress guarantees are in the kernel. If we
want new ABIs around it that give O_NONBLOCK semantics to build or
update, that can be added on after the fact.

Plus, if userspace *WANTS* to coordinate the whole shebang, they're free
to. They'd never see long hangs because they would be coordinating.

Thoughts?