[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68fbd63450c7c_10e910021@dwillia2-mobl4.notmuch>
Date: Fri, 24 Oct 2025 12:40:36 -0700
From: <dan.j.williams@...el.com>
To: Dave Hansen <dave.hansen@...el.com>, Chao Gao <chao.gao@...el.com>
CC: Vishal Annapurve <vannapurve@...gle.com>, "Reshetova, Elena"
<elena.reshetova@...el.com>, "linux-coco@...ts.linux.dev"
<linux-coco@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>, "Chatre,
Reinette" <reinette.chatre@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>,
"Huang, Kai" <kai.huang@...el.com>, "Williams, Dan J"
<dan.j.williams@...el.com>, "yilun.xu@...ux.intel.com"
<yilun.xu@...ux.intel.com>, "sagis@...gle.com" <sagis@...gle.com>,
"paulmck@...nel.org" <paulmck@...nel.org>, "nik.borisov@...e.com"
<nik.borisov@...e.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar
<mingo@...hat.com>, "Kirill A. Shutemov" <kas@...nel.org>, Paolo Bonzini
<pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support
Dave Hansen wrote:
> On 10/24/25 00:43, Chao Gao wrote:
> ...
> > Beyond "the kvm_tdx object gets torn down during a build," I see two potential
> > issues:
> >
> > 1. TD Build and TDX migration aren't purely kernel processes -- they span multiple
> > KVM ioctls. Holding a read-write lock throughout the entire process would
> > require exiting to userspace while the lock is held. I think this is
> > irregular, but I'm not sure if it's acceptable for read-write semaphores.
>
> Sure, I guess it's irregular. But look at it this way: let's say we
> concocted some scheme to use a TD build refcount and a module update
> flag, had them both wait_event_interruptible() on each other, and then
> did wakeups. That would get the same semantics without an rwsem.
This sounds unworkable to me.
First, you cannot return to userspace while holding a lock. Lockdep will
rightfully scream:
"WARNING: lock held when returning to user space!"
The complexity of ensuring that a multi-stage ABI transaction completes
from the kernel side is painful. If that process dies in the middle of
its ABI sequence who cleans up these references?
The operational mechanism to make sure that one process flow does not
mess up another process flow is for those process to communicate with
*userspace* file locks, or for those process to check for failures after
the fact and retry. Unless you can make the build side an atomic ABI,
this is a documentation + userspace problem, not a kernel problem.
Powered by blists - more mailing lists