linux-kernel - Re: [PATCH v3 00/24] KVM: TDX huge page support for private memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aWgyhmTJphGQqO0Y@google.com>
Date: Wed, 14 Jan 2026 16:19:18 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Yan Zhao <yan.y.zhao@...el.com>, Ackerley Tng <ackerleytng@...gle.com>, 
	Vishal Annapurve <vannapurve@...gle.com>, pbonzini@...hat.com, linux-kernel@...r.kernel.org, 
	kvm@...r.kernel.org, x86@...nel.org, rick.p.edgecombe@...el.com, 
	kas@...nel.org, tabba@...gle.com, michael.roth@....com, david@...nel.org, 
	sagis@...gle.com, vbabka@...e.cz, thomas.lendacky@....com, 
	nik.borisov@...e.com, pgonda@...gle.com, fan.du@...el.com, jun.miao@...el.com, 
	francescolavra.fl@...il.com, jgross@...e.com, ira.weiny@...el.com, 
	isaku.yamahata@...el.com, xiaoyao.li@...el.com, kai.huang@...el.com, 
	binbin.wu@...ux.intel.com, chao.p.peng@...el.com, chao.gao@...el.com
Subject: Re: [PATCH v3 00/24] KVM: TDX huge page support for private memory

On Wed, Jan 14, 2026, Dave Hansen wrote:
> On 1/14/26 07:26, Sean Christopherson wrote:
> ...
> > Dave may feel differently, but I am not going to budge on this.  I am not going
> > to bake in assumptions throughout KVM about memory being backed by page+folio.
> > We _just_ cleaned up that mess in the aformentioned "Stop grabbing references to
> > PFNMAP'd pages" series, I am NOT reintroducing such assumptions.
> > 
> > NAK to any KVM TDX code that pulls a page or folio out of a guest_memfd pfn.
> 
> 'struct page' gives us two things: One is the type safety, but I'm
> pretty flexible on how that's implemented as long as it's not a raw u64
> getting passed around everywhere.

I don't necessarily disagree on the type safety front, but for the specific code
in question, any type safety is a facade.  Everything leading up to the TDX code
is dealing with raw PFNs and/or PTEs.  Then the TDX code assumes that the PFN
being mapped into the guest is backed by a struct page, and that the folio size
is consistent with @level, without _any_ checks whatsover.  This is providing
the exact opposite of safety.

  static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
			    enum pg_level level, kvm_pfn_t pfn)
  {
	int tdx_level = pg_level_to_tdx_sept_level(level);
	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
	struct page *page = pfn_to_page(pfn);    <==================
	struct folio *folio = page_folio(page);
	gpa_t gpa = gfn_to_gpa(gfn);
	u64 entry, level_state;
	u64 err;

	err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, folio,
			       folio_page_idx(folio, page), &entry, &level_state);

	...
  }

I've no objection if e.g. tdh_mem_page_aug() wants to sanity check that a PFN
is backed by a struct page with a valid refcount, it's code like that above that
I don't want.

> The second thing is a (near) guarantee that the backing memory is RAM.
> Not only RAM, but RAM that the TDX module knows about and has a PAMT and
> TDMR and all that TDX jazz.

I'm not at all opposed to backing guest_memfd with "struct page", quite the
opposite.  What I don't want is to bake assumptions into KVM code that doesn't
_require_ struct page, because that has cause KVM immense pain in the past.

And I'm strongly opposed to KVM special-casing TDX or anything else, precisely
 because we struggled through all that pain so that KVM would work better with
memory that isn't backed by "struct page", or more specifically, memory that has
an associated "struct page", but isn't managed by core MM, e.g. isn't refcounted.

> We've also done things like stopping memory hotplug because you can't
> amend TDX page metadata at runtime. So we prevent new 'struct pages'
> from coming into existence. So 'struct page' is a quite useful choke
> point for TDX.
> 
> I'd love to hear more about how guest_memfd is going to tie all the
> pieces together and give the same straightforward guarantees without
> leaning on the core mm the same way we do now.

I don't think guest_memfd needs to be different, and that's not what I'm advocating.
What I don't want is to make KVM TDX's handling of memory different from the rest
of KVM and KVM's MMU(s).