Message-ID: <aOxLl90ndWP9AinU@yzhao56-desk.sh.intel.com>
Date: Mon, 13 Oct 2025 08:45:11 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Ackerley Tng <ackerleytng@...gle.com>
CC: <pbonzini@...hat.com>, <seanjc@...gle.com>,
	<linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>, <x86@...nel.org>,
	<rick.p.edgecombe@...el.com>, <dave.hansen@...el.com>, <kas@...nel.org>,
	<tabba@...gle.com>, <quic_eberman@...cinc.com>, <michael.roth@....com>,
	<david@...hat.com>, <vannapurve@...gle.com>, <vbabka@...e.cz>,
	<thomas.lendacky@....com>, <pgonda@...gle.com>, <zhiquan1.li@...el.com>,
	<fan.du@...el.com>, <jun.miao@...el.com>, <ira.weiny@...el.com>,
	<isaku.yamahata@...el.com>, <xiaoyao.li@...el.com>,
	<binbin.wu@...ux.intel.com>, <chao.p.peng@...el.com>
Subject: Re: [RFC PATCH v2 17/23] KVM: guest_memfd: Split for punch hole and
 private-to-shared conversion

On Wed, Oct 01, 2025 at 08:00:21AM +0000, Ackerley Tng wrote:
> Yan Zhao <yan.y.zhao@...el.com> writes:
> 
> I was looking deeper into this patch since on my WIP tree I already had
> the invalidate and zap steps separated out and had to do more to rebase
> this patch :)
> 
> > In TDX, private page tables require precise zapping because faulting back
> > the zapped mappings necessitates the guest's re-acceptance.
> 
> I feel that this statement could be better phrased because all zapped
> mappings require re-acceptance, not just anything related to precise
> zapping. Would this be better:
> 
>     On private-to-shared conversions, page table entries must be zapped
>     from the Secure EPTs. Any pages mapped into Secure EPTs must be
>     accepted by the guest before they are used.
> 
>     Hence, care must be taken to only precisely zap ranges requested for
>     private-to-shared conversion, since the guest is only prepared to
>     re-accept precisely the ranges it requested for conversion.
> 
>     The guest may request to convert ranges not aligned with private
>     page table entry boundaries. To precisely zap these ranges, huge
>     leaves that span the boundaries of the requested ranges must be
>     split into smaller leaves, so that the split, smaller leaves now
>     align with the requested range for zapping.
LGTM. Thanks!
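
To make the alignment condition in the proposed wording concrete, here is a
minimal sketch, not the code from this patch. The helper names
(pages_per_level, boundary_needs_split, range_needs_split) are hypothetical,
and it assumes the usual x86 layout of 9 GFN bits per page table level
(level 1 = 4K, level 2 = 2M, level 3 = 1G). It only answers whether an edge
of the requested range would fall inside a huge leaf of a given level; whether
such a leaf is actually mapped there is a separate question.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical helper: number of 4K pages covered by one leaf at 'level'
 * (1 = 4K, 2 = 2M, 3 = 1G), assuming 9 GFN bits per level. */
static inline gfn_t pages_per_level(int level)
{
	return (gfn_t)1 << ((level - 1) * 9);
}

/* A boundary falls strictly inside a leaf of 'level' iff it is not
 * aligned to that leaf's size. */
static inline bool boundary_needs_split(gfn_t boundary, int level)
{
	return (boundary & (pages_per_level(level) - 1)) != 0;
}

/* Precisely zapping [start, end) needs a split if either edge would cut
 * through a huge leaf of 'level' (assuming such a leaf is mapped there). */
static inline bool range_needs_split(gfn_t start, gfn_t end, int level)
{
	return boundary_needs_split(start, level) ||
	       boundary_needs_split(end, level);
}
```

For example, with 2M leaves (level 2), the range [0x200, 0x400) is aligned at
both edges and needs no split, while [0x201, 0x400) cuts through the leaf that
starts at GFN 0x200.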

> > Therefore,
> > before performing a zap for hole punching and private-to-shared
> > conversions, huge leafs that cross the boundary of the zapping GFN range in
> > the mirror page table must be split.
> >
> > Splitting may result in an error. If this happens, hole punching and
> > private-to-shared conversion should bail out early and return an error to
> > userspace.
> >
> > Splitting is not necessary for kvm_gmem_release() since the entire page
> > table is being zapped, nor for kvm_gmem_error_folio() as an SPTE must not
> > map more than one physical folio.
> >
> 
> I think splitting is not necessary as long as aligned page table entries
> are zapped. Splitting is also not necessary if the entire page table is
> zapped but that's a superset of zapping aligned page table
> entries. (Probably just a typo on your side.) Here's my attempt at
Which typo are you referring to?

> rephrasing this:
> 
>     Splitting is not necessary for the cases where only aligned page
>     table entries are zapped, such as during kvm_gmem_release() where
By "page table entries", you mean SPTEs, i.e., entries in the secondary MMU,
right?

>     the entire guest_memfd worth of memory is zapped, nor for
>     truncation, where truncation of pages within a huge folio is not
>     allowed.
I think "splitting is not required for truncation" holds only because of KVM's
current implementation, in which "an SPTE must not map more than one physical
folio", i.e., the SPTE size is <= the folio size.

If KVM were implemented differently where one SPTE could cover multiple folios
(similar to IOMMU SLTP entries for shared memory, though this is unlikely to
happen), splitting would still be required before truncation.
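
A rough sketch of that dependency, under stated assumptions: the function name
truncation_needs_split is hypothetical, sizes are counted in 4K pages, both the
folio and the SPTE are power-of-two sized and naturally aligned, and truncation
always removes a whole folio. This is not KVM code, only an illustration of the
argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does truncating the whole folio
 * [folio_start, folio_start + folio_pages) cut through an SPTE that
 * covers 'spte_pages' pages? Both sizes are powers of two. */
static bool truncation_needs_split(uint64_t folio_start,
				   uint64_t folio_pages,
				   uint64_t spte_pages)
{
	uint64_t mask = spte_pages - 1;
	uint64_t folio_end = folio_start + folio_pages;

	/* If spte_pages <= folio_pages, both edges of a naturally aligned
	 * folio are SPTE-aligned, so this is always false. */
	return (folio_start & mask) != 0 || (folio_end & mask) != 0;
}
```

With a 2M folio (512 pages) and an SPTE of 2M or 4K, the check never asks for a
split. With a hypothetical 1G SPTE spanning several 2M folios, truncating one
of those folios would require splitting first.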
