Message-ID: <aPsuD2fbYwCccgNi@intel.com>
Date: Fri, 24 Oct 2025 15:43:11 +0800
From: Chao Gao <chao.gao@...el.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: Vishal Annapurve <vannapurve@...gle.com>, "Reshetova, Elena"
<elena.reshetova@...el.com>, "linux-coco@...ts.linux.dev"
<linux-coco@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>, "Chatre,
Reinette" <reinette.chatre@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>,
"Huang, Kai" <kai.huang@...el.com>, "Williams, Dan J"
<dan.j.williams@...el.com>, "yilun.xu@...ux.intel.com"
<yilun.xu@...ux.intel.com>, "sagis@...gle.com" <sagis@...gle.com>,
"paulmck@...nel.org" <paulmck@...nel.org>, "nik.borisov@...e.com"
<nik.borisov@...e.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar
<mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>, Paolo Bonzini
<pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support
>One thing I don't think I've heard anyone be worried about is how timely
>the update process is. So how about this: Updates wait for any existing
>builds to complete. But, new builds wait for updates. That can be done
>with a single rwsem:
>
>struct rw_semaphore update_rwsem;
>
>tdx_td_init()
>{
> ...
>+ down_read_interruptible(&update_rwsem);
> kvm_tdx->state = TD_STATE_INITIALIZED;
>
>tdx_td_finalize()
>{
> ...
>+ up_read(&update_rwsem);
> kvm_tdx->state = TD_STATE_RUNNABLE;
>
>A module update does:
>
> down_write_interruptible(&update_rwsem);
> do_actual_update();
> up_write(&update_rwsem);
>
>There would be no corruption issues, no erroring out of the build
>process, and no punting to userspace to ensure forward progress.
>
>The big downside is that both the build process and update process can
>appear to hang for a long time. It'll also be a bit annoying to ensure
>that there are up_read(&update_rwsem)'s if the kvm_tdx object gets torn
>down during a build.
>
>But the massive upside is that there's no new ABI and all the
>consistency and forward progress guarantees are in the kernel. If we
>want new ABIs around it that give O_NONBLOCK semantics to build or
>update, that can be added on after the fact.
>
>Plus, if userspace *WANTS* to coordinate the whole shebang, they're free
>to. They'd never see long hangs because they would be coordinating.
>
>Thoughts?

Hi Dave,

Thanks for this summary and suggestion.

Beyond "the kvm_tdx object gets torn down during a build," I see two potential
issues:

1. TD build and TDX migration aren't purely kernel processes -- they span
   multiple KVM ioctls. Holding a read-write lock throughout the entire
   process would require returning to userspace with the lock held. I think
   this is irregular, and I'm not sure it's acceptable for read-write
   semaphores.

2. The kernel may need to hold this read-write lock for operations that are
   not yet defined. The TDX Module Base spec [*] notes on page 55:

   : Future TDX Module versions may have different or additional update-sensitive
   : cases. By design, such cases apply to a small portion of the overall TD
   : lifecycle.

[*]: https://cdrdv2.intel.com/v1/dl/getContent/733575

Given these concerns, I'm not sure whether implementing a read-write lock in
the kernel is the right approach.

Since Google prefers to "avoid updates during update-sensitive times," we can
implement that approach for now. If other Linux users find this insufficient
and, with strong justification, prefer failing TD build/migration operations
instead, we can enable that functionality in the future.

What do you think?