Message-ID: <8bd4850b0c74fbed531232a4a69603882a5562a1.camel@intel.com>
Date: Wed, 3 Dec 2025 19:59:50 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "Hansen, Dave" <dave.hansen@...el.com>, "nik.borisov@...e.com"
<nik.borisov@...e.com>, "kas@...nel.org" <kas@...nel.org>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Li, Xiaoyao"
<xiaoyao.li@...el.com>, "Huang, Kai" <kai.huang@...el.com>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, "Zhao, Yan Y"
<yan.y.zhao@...el.com>, "Wu, Binbin" <binbin.wu@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"seanjc@...gle.com" <seanjc@...gle.com>, "mingo@...hat.com"
<mingo@...hat.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "Yamahata, Isaku"
<isaku.yamahata@...el.com>, "Annapurve, Vishal" <vannapurve@...gle.com>,
"Gao, Chao" <chao.gao@...el.com>, "bp@...en8.de" <bp@...en8.de>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v4 07/16] x86/virt/tdx: Add tdx_alloc/free_page() helpers
On Wed, 2025-12-03 at 10:21 -0800, Dave Hansen wrote:
> > Thanks Dave. Yes, let's stick to the spec. I'm going to try to pull the loops
> > out too because we can get rid of the union array thing too.
>
> Also, I honestly don't see the problem with just allocating an order-1
> page for this. Yeah, the TDX module doesn't need physically contiguous
> pages, but it's easier for _us_ to lug them around if they are
> physically contiguous.
We have two spin locks to contend with for these allocations. One is the global
spin lock on the arch/x86 side. In this case, the pages don't have to be
passed far, like:
tdx_pamt_get(some_page, NULL)
    page1 = alloc()
    page2 = alloc()
    scoped_guard(spinlock, &pamt_lock) {
        tdh_phymem_pamt_add(.., page1, page2)
            /* Pack into struct */
            seamcall()
    }
I think it's not too bad?
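To make the shape of that arch/x86 flow concrete, here is a minimal userspace sketch, not the kernel code. It only shows the ordering: both pages are allocated before the lock is taken, and only the hand-off happens under it. Everything named `sketch_*` or `SKETCH_*`, the `pamt_pairs` array, and the use of `malloc()` and a C11 `atomic_flag` in place of the kernel page allocator and spinlock are assumptions for illustration; the real `tdh_phymem_pamt_add()` issues a SEAMCALL, which is only stood in for here by recording the pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAX_PAIRS 16

/* Hypothetical stand-in for the global arch/x86 spin lock (pamt_lock). */
static atomic_flag pamt_lock = ATOMIC_FLAG_INIT;

/* Hypothetical record of the page pairs "handed to the TDX module". */
static void *pamt_pairs[SKETCH_MAX_PAIRS][2];
static size_t nr_pamt_pairs;

static void sketch_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void sketch_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Returns 0 on success, -1 on allocation failure or a full table. */
static int pamt_get_sketch(void)
{
    /* Both allocations happen before the lock is taken. */
    void *page1 = malloc(SKETCH_PAGE_SIZE);
    void *page2 = malloc(SKETCH_PAGE_SIZE);
    int ret = -1;

    if (page1 && page2) {
        sketch_spin_lock(&pamt_lock);
        if (nr_pamt_pairs < SKETCH_MAX_PAIRS) {
            /* Stand-in for tdh_phymem_pamt_add(.., page1, page2). */
            pamt_pairs[nr_pamt_pairs][0] = page1;
            pamt_pairs[nr_pamt_pairs][1] = page2;
            nr_pamt_pairs++;
            ret = 0;
        }
        assert(nr_pamt_pairs <= SKETCH_MAX_PAIRS);
        sketch_spin_unlock(&pamt_lock);
    }

    if (ret) {
        /* Nothing was handed over, so give the pages back. */
        free(page1);
        free(page2);
    }
    return ret;
}
```

The point of the sketch is only that the two pages travel a single function's worth of distance, which is why passing them separately is cheap on this side.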
Then there is the KVM MMU spin lock during the fault path. This lock is taken way
up the call chain. It goes something like:
topup_tdx_pages_cache() /* Add order-0 pages for S-EPT page tables and dpamt */
spin_lock()
    ... many calls ...
        order_0_s_ept_page = alloc_from_order_0_cache();
        tdx_sept_link_private_spt(order_0_s_ept_page)
            tdx_pamt_get(order_0_s_ept_page, order_0_cache)
                /* alloc two pages from order_0_cache for dpamt */
        tdx_sept_set_private_spte(guest_page)
            tdx_pamt_get(guest_page, order_0_cache)
                /* alloc two pages from order_0_cache for dpamt */
spin_unlock()
So if we decide to pass a single order-1 page into tdx_pamt_get() instead of
order_0_cache, we can stop passing the cache between KVM and arch/x86, but we
then need two caches instead of one: one for order-0 S-EPT page tables and one
for order-1 DPAMT page pairs.
Also, if we have to allocate the order-1 page in each caller, it simplifies the
arch/x86 code, but duplicates the allocation in the KVM callers (only 2 today
though).
So I'm suspicious it's not going to be a big win, but I'll give it a try.
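For reference, the single-cache fault path described above can be sketched in plain userspace C. This is a rough illustration under assumptions, not the KVM code: `struct page_cache`, `cache_topup()`, `cache_alloc()`, `pamt_get_from_cache()` and `fault_path_sketch()` are hypothetical names, `malloc()` stands in for the page allocator, and the locked region is only marked by comments. The count of five pages comes from the call chain: one S-EPT page table page, two DPAMT pages for it, and two DPAMT pages for the guest page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096
#define CACHE_CAPACITY   8

/* Hypothetical order-0 page cache, filled before the MMU lock is taken. */
struct page_cache {
    void *pages[CACHE_CAPACITY];
    int nr;
};

/* May allocate, so it must run before the spin lock is held. */
static int cache_topup(struct page_cache *c, int min)
{
    assert(min <= CACHE_CAPACITY);
    while (c->nr < min) {
        void *p = malloc(SKETCH_PAGE_SIZE);
        if (!p)
            return -1;
        c->pages[c->nr++] = p;
    }
    return 0;
}

/* Never allocates, so it is safe under the lock. */
static void *cache_alloc(struct page_cache *c)
{
    return c->nr ? c->pages[--c->nr] : NULL;
}

/* Stand-in for tdx_pamt_get(page, cache): takes two pages for DPAMT. */
static int pamt_get_from_cache(struct page_cache *c, void *pair[2])
{
    if (c->nr < 2)
        return -1;
    pair[0] = cache_alloc(c);
    pair[1] = cache_alloc(c);
    return 0;
}

/* Returns the number of pages left in the cache, or -1 on failure. */
static int fault_path_sketch(struct page_cache *c)
{
    void *sept_page, *sept_pamt[2], *guest_pamt[2];

    if (cache_topup(c, 5))          /* 1 S-EPT + 2 + 2 DPAMT pages */
        return -1;

    /* spin_lock() would be taken here */
    sept_page = cache_alloc(c);
    if (!sept_page || pamt_get_from_cache(c, sept_pamt) ||
        pamt_get_from_cache(c, guest_pamt))
        return -1;                  /* cannot happen after a full topup */
    /* spin_unlock() would be released here */

    /* The sketch does not keep the pages, so release them. */
    free(sept_page);
    free(sept_pamt[0]);
    free(sept_pamt[1]);
    free(guest_pamt[0]);
    free(guest_pamt[1]);
    return c->nr;
}
```

With order-1 pairs, `pamt_get_from_cache()` would instead pull one entry from a second, order-1 cache, which is the extra cache mentioned above.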
>
> Plus, if you permanently allocate 2 order-0 pages, you are _probably_
> going to permanently destroy 2 potential future 2MB pages. The order-1
> allocation will only destroy 1.
Doesn't the buddy allocator try to avoid splitting larger blocks? I guess you
mean in the worst case, but the DPAMT pages should not be allocated forever
either. So I think it's only at the intersection of two worst cases? Worth it?