Message-ID: <c2a62badf190717a251d269a6905872b01e8e340.camel@intel.com>
Date: Mon, 11 Aug 2025 22:30:58 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "kas@...nel.org" <kas@...nel.org>
CC: "Gao, Chao" <chao.gao@...el.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
"Huang, Kai" <kai.huang@...el.com>, "bp@...en8.de" <bp@...en8.de>,
"x86@...nel.org" <x86@...nel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "linux-coco@...ts.linux.dev"
<linux-coco@...ts.linux.dev>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>
Subject: Re: [PATCHv2 00/12] TDX: Enable Dynamic PAMT
On Mon, 2025-08-11 at 07:31 +0100, kas@...nel.org wrote:
> > I don't see any other reason for the global spin lock, Kirill was that it?
> > Did you consider also adding a lock per 2MB region, like the refcount? Or
> > any other granularity of lock besides global? Not saying global is
> > definitely the wrong choice, but seems arbitrary if I got the above right.
>
> We have discussed this before[1]. Global locking is problematic when you
> actually hit contention. Let's not complicate things until we actually
> see it. I failed to demonstrate contention without huge pages. With huge
> pages it is even more dubious that we ever see it.
>
> [1]
> https://lore.kernel.org/all/4bb2119a-ff6d-42b6-acf4-86d87b0e9939@intel.com/
Ah, I see.
I just did a test of simultaneously starting 10 VMs with 16GB of RAM (non huge
pages) and then shutting them down. I saw 701 contentions on startup, and 53
more on shutdown. Total wait time was 2ms. Not horrible, but not theoretical
either. It probably wasn't much of a cacheline-bouncing worst case, though. And
I guess this is on my latest changes, not this exact v2, but that part
shouldn't have changed.
But hmm, it seems Dave's objection about maintaining the lock allocations
would apply to the refcounts too? But the hotplug concerns shouldn't actually
be an issue for TDX, because hotplug gets rejected if the allocations are not
already there. So the complexity of a per-2MB lock should be minimal, at least
incrementally. The difference seems to be more about memory use vs performance.
What gives me pause is that in the KVM TDX work we have tried really hard not
to take exclusive locks in the shared MMU lock path. Admittedly that wasn't
backed by hard numbers. But an enormous amount of work went into letting KVM
faults happen under the shared lock for normal VMs. So on one hand, yes, it's
premature optimization. But on the other hand, it's a maintainability concern
about polluting the existing way things work in KVM with special TDX
properties.
I think we need to at least call out loudly that the decision was to go with
the simplest possible solution, and the impact on KVM. I'm not sure what
Sean's opinion is, but I wouldn't want him to first learn of it when he went
digging and found a buried global spin lock in the fault path.