Message-ID: <687e7d042afa_203db29463@iweiny-mobl.notmuch>
Date: Mon, 21 Jul 2025 12:46:44 -0500
From: Ira Weiny <ira.weiny@...el.com>
To: Vishal Annapurve <vannapurve@...gle.com>, Ira Weiny <ira.weiny@...el.com>
CC: Yan Zhao <yan.y.zhao@...el.com>, Sean Christopherson <seanjc@...gle.com>,
Michael Roth <michael.roth@....com>, <pbonzini@...hat.com>,
<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<rick.p.edgecombe@...el.com>, <kai.huang@...el.com>,
<adrian.hunter@...el.com>, <reinette.chatre@...el.com>,
<xiaoyao.li@...el.com>, <tony.lindgren@...el.com>,
<binbin.wu@...ux.intel.com>, <dmatlack@...gle.com>,
<isaku.yamahata@...el.com>, <david@...hat.com>, <ackerleytng@...gle.com>,
<tabba@...gle.com>, <chao.p.peng@...el.com>
Subject: Re: [RFC PATCH] KVM: TDX: Decouple TDX init mem region from
kvm_gmem_populate()
Vishal Annapurve wrote:
> On Fri, Jul 18, 2025 at 11:41 AM Ira Weiny <ira.weiny@...el.com> wrote:
> >
> > Vishal Annapurve wrote:
> > > On Fri, Jul 18, 2025 at 2:15 AM Yan Zhao <yan.y.zhao@...el.com> wrote:
> > > >
> > > > On Tue, Jul 15, 2025 at 09:10:42AM +0800, Yan Zhao wrote:
> > > > > On Mon, Jul 14, 2025 at 08:46:59AM -0700, Sean Christopherson wrote:
> > > > > > > > folio = __kvm_gmem_get_pfn(file, slot, index, &pfn, &is_prepared, &max_order);
> > > > > > > If max_order > 0 is returned, the next invocation of __kvm_gmem_populate() for
> > > > > > > GFN+1 will return is_prepared == true.
> > > > > >
> > > > > > I don't see any reason to try and make the current code truly work with hugepages.
> > > > > > Unless I've misunderstood where we stand, the correctness of hugepage support is
> > > > > Hmm. I thought your stand was to address the AB-BA lock issue which will be
> > > > > introduced by huge pages, so you moved the get_user_pages() from vendor code to
> > > > > the common code in guest_memfd :)
> > > > >
> > > > > > going to depend heavily on the implementation for preparedness. I.e. trying to
> > > > > > make this all work with per-folio granularity just isn't possible, no?
> > > > > Ah. I understand now. You mean the right implementation of __kvm_gmem_get_pfn()
> > > > > should return is_prepared at 4KB granularity rather than per-folio granularity.
> > > > >
> > > > > So, huge pages still have a dependency on the implementation for preparedness.
> > > > Looks like with [3], is_prepared will not be checked in kvm_gmem_populate().
> > > >
> > > > > Will you post code [1][2] to fix non-hugepages first? Or can I pull them to use
> > > > > as prerequisites for TDX huge page v2?
> > > > So, maybe I can use [1][2][3] as the base.
> > > >
> > > > > [1] https://lore.kernel.org/all/aG_pLUlHdYIZ2luh@google.com/
> > > > > [2] https://lore.kernel.org/all/aHEwT4X0RcfZzHlt@google.com/
> > >
> > > IMO, unless there is any objection to [1], it's unnecessary to
> > > maintain kvm_gmem_populate for any arch (even for SNP). All the
> > > initial memory population logic needs is the stable pfn for a given
> > > gfn, which should ideally be available using standard mechanisms
> > > such as an EPT/NPT page table walk under a read KVM mmu lock (this
> > > patch already demonstrates it working).
> > >
> > > It will be hard to clean up this logic once we have all the
> > > architectures using this path.
> >
> > Did you mean to say 'not hard'?
>
> Let me rephrase my sentence:
> It will be harder to remove kvm_gmem_populate if we punt it to the future.
Indeed if more folks start using it.
Ira
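
[Editor's note: the lookup Vishal describes above -- resolving a stable pfn for a
gfn by walking the installed page tables under the read-mode mmu_lock, rather
than going through kvm_gmem_populate() -- might be sketched roughly as below.
This is pseudocode in kernel-C style, not the posted patch; the helper name
tdx_gfn_to_pfn_locked() is illustrative, and the use of kvm_tdp_mmu_get_walk()
is an assumption based on the x86 TDP MMU.]

```c
/*
 * Hypothetical sketch: look up the pfn already mapped for @gfn by
 * walking the SPTEs under the read-mode mmu_lock.  If a present leaf
 * SPTE exists, the pfn it points at is stable for as long as the
 * caller's invariants (e.g. guest_memfd pinning) keep it so.
 */
static int tdx_gfn_to_pfn_locked(struct kvm_vcpu *vcpu, gfn_t gfn,
				 kvm_pfn_t *pfn)
{
	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
	int root_level, leaf;
	int ret = -ENOENT;

	read_lock(&vcpu->kvm->mmu_lock);

	/* Walk the TDP MMU to the leaf entry covering this gfn. */
	leaf = kvm_tdp_mmu_get_walk(vcpu, gfn_to_gpa(gfn), sptes,
				    &root_level);
	if (leaf >= 0 && is_shadow_present_pte(sptes[leaf])) {
		*pfn = spte_to_pfn(sptes[leaf]);
		ret = 0;
	}

	read_unlock(&vcpu->kvm->mmu_lock);
	return ret;
}
```

If something like this suffices for the initial memory population path, the
arch code no longer needs the kvm_gmem_populate() callback plumbing at all,
which is the cleanup being argued for in the thread.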