Message-ID: <24665176b1e6b169441c9f6db9b5d02d073377a4.camel@intel.com>
Date: Tue, 20 Jan 2026 19:53:19 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-coco@...ts.linux.dev"
	<linux-coco@...ts.linux.dev>, "Huang, Kai" <kai.huang@...el.com>, "Li,
 Xiaoyao" <xiaoyao.li@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
	"Zhao, Yan Y" <yan.y.zhao@...el.com>, "Wu, Binbin" <binbin.wu@...el.com>,
	"kas@...nel.org" <kas@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>, "tglx@...utronix.de"
	<tglx@...utronix.de>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
	"Annapurve, Vishal" <vannapurve@...gle.com>, "Gao, Chao"
	<chao.gao@...el.com>, "bp@...en8.de" <bp@...en8.de>, "x86@...nel.org"
	<x86@...nel.org>
Subject: Re: [PATCH v4 11/16] KVM: TDX: Add x86 ops for external spt cache

Sean, really appreciate you taking a look despite being overbooked.

On Fri, 2026-01-16 at 16:53 -0800, Sean Christopherson wrote:
> NAK.  I kinda sorta get why you did this?  But the pages KVM uses for page tables
> are KVM's, not to be mixed with PAMT pages.
> 
> Eww.  Definitely a hard "no".  In tdp_mmu_alloc_sp_for_split(), the allocation
> comes from KVM:
> 
> 	if (mirror) {
> 		sp->external_spt = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
> 		if (!sp->external_spt) {
> 			free_page((unsigned long)sp->spt);
> 			kmem_cache_free(mmu_page_header_cache, sp);
> 			return NULL;
> 		}
> 	}

Ah, this is from the TDX huge pages series. There is a bit of fallout from
TDX/coco's eternal nemesis: stacks of code all being co-designed at once.

Dave has been directing us recently to focus only on the needs of the current
series. Now that we can test at each incremental step, we don't have the same
problems as before. But of course there is still a desire for an updated TDX
huge pages series, etc., to help with development of all the other WIP stuff.

For this design aspect of how the topup caches work for DPAMT, he asked
specifically for the DPAMT patches to *not* consider how TDX huge pages will use
them.

Now, the TDX huge pages cover letter asked you to look at some aspects of that,
and traditionally the KVM side has preferred to look at how the code is all
going to work together. The presentation of this was a bit rushed and
confusing, but looking forward, how do you want to do this?

After the 130-patch ordeal, I'm somewhat amenable to Dave's view. What do you
think?

> 
> But then in kvm_tdp_mmu_map(), via kvm_mmu_alloc_external_spt(), the allocation
> comes from get_tdx_prealloc_page()
> 
>   static void *tdx_alloc_external_fault_cache(struct kvm_vcpu *vcpu)
>   {
> 	struct page *page = get_tdx_prealloc_page(&to_tdx(vcpu)->prealloc);
> 
> 	if (WARN_ON_ONCE(!page))
> 		return (void *)__get_free_page(GFP_ATOMIC | __GFP_ACCOUNT);
> 
> 	return page_address(page);
>   }
> 
> But then regardless of where the page came from, KVM frees it.  Seriously.
> 
>   static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
>   {
> 	free_page((unsigned long)sp->external_spt);  <=====

Ah! Ok, we might have a good option here. Kai had suggested that instead of
pre-faulting all the pages that we might need for DPAMT backing, we pre-install
the DPAMT backing for all pages in the external page table cache at top-up
time. It has a complication for TDX huge pages, which needs to decide late
whether a page is huge and needs DPAMT backing or not, but based on Dave's
direction above I gave it a try. A downside for pure DPAMT was that it needed
to handle the freeing specially, because it potentially needed to reclaim the
DPAMT backing. This required an x86 op for freeing these pages, and was a bit
more code on the KVM side in general.

But if the asymmetry is a problem either way, maybe I'll give it a try for v5. 

> 	free_page((unsigned long)sp->spt);
> 	kmem_cache_free(mmu_page_header_cache, sp);
>   }
> 
> Oh, and the hugepage series also fumbles its topup (why there's yet another
> topup API, I have no idea).
> 
>   static int tdx_topup_vm_split_cache(struct kvm *kvm, enum pg_level level)
>   {
> 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> 	struct tdx_prealloc *prealloc = &kvm_tdx->prealloc_split_cache;
> 	int cnt = tdx_min_split_cache_sz(kvm, level);
> 
> 	while (READ_ONCE(prealloc->cnt) < cnt) {
> 		struct page *page = alloc_page(GFP_KERNEL);  <==== GFP_KERNEL_ACCOUNT
> 
> 		if (!page)
> 			return -ENOMEM;
> 
> 		spin_lock(&kvm_tdx->prealloc_split_cache_lock);
> 		list_add(&page->lru, &prealloc->page_list);
> 		prealloc->cnt++;
> 		spin_unlock(&kvm_tdx->prealloc_split_cache_lock);
> 	}
> 
> 	return 0;
>   }
