[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7927271c-61e6-4f90-9127-c855a92fe766@intel.com>
Date: Fri, 26 Sep 2025 09:11:54 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>
Cc: "Gao, Chao" <chao.gao@...el.com>, "seanjc@...gle.com"
<seanjc@...gle.com>, "Huang, Kai" <kai.huang@...el.com>,
"kas@...nel.org" <kas@...nel.org>, "Annapurve, Vishal"
<vannapurve@...gle.com>, "bp@...en8.de" <bp@...en8.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v3 00/16] TDX: Enable Dynamic PAMT
On 9/26/25 09:02, Edgecombe, Rick P wrote:
> On Fri, 2025-09-26 at 07:09 -0700, Dave Hansen wrote:
>> On 9/25/25 19:28, Yan Zhao wrote:
>>>> Lastly, Yan raised some last minute doubts internally about TDX
>>>> module locking contention. I’m not sure there is a problem, but
>>>> we
>>>> can come to an agreement as part of the review.
>>> Yes, I found a contention issue that prevents us from dropping the
>>> global lock. I've also written a sample test that demonstrates this
>>> contention.
>> But what is the end result when this contention happens? Does
>> everyone livelock?
>
> You get a TDX_OPERAND_BUSY error code returned. Inside the TDX module
> each lock is a try lock. The TDX module tries to take a sequence of
> locks and if it meets any contention it will release them all and
> return the TDX_OPERAND_BUSY. Some paths in KVM cannot handle failure
> and so don't have a way to handle the error.
>
> So another option to handling this is just to retry until you succeed.
> Then you have a very strange spinlock with a heavyweight inner loop.
> But since each time the locks are released, some astronomical bad
> timing might prevent forward progress. On the KVM side we have avoided
> looping. Although, I think we have not exhausted this path.
If it can't return failure then the _only_ other option is to spin. Right?
I understand the reluctance to have such a nasty spin loop. But other
than reworking the KVM code to do the retries at a higher level, is
there another option?
Powered by blists - more mailing lists