Message-ID: <diqzecrn2gru.fsf@google.com>
Date: Wed, 01 Oct 2025 08:00:21 +0000
From: Ackerley Tng <ackerleytng@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>, pbonzini@...hat.com, seanjc@...gle.com
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, x86@...nel.org,
rick.p.edgecombe@...el.com, dave.hansen@...el.com, kas@...nel.org,
tabba@...gle.com, quic_eberman@...cinc.com, michael.roth@....com,
david@...hat.com, vannapurve@...gle.com, vbabka@...e.cz,
thomas.lendacky@....com, pgonda@...gle.com, zhiquan1.li@...el.com,
fan.du@...el.com, jun.miao@...el.com, ira.weiny@...el.com,
isaku.yamahata@...el.com, xiaoyao.li@...el.com, binbin.wu@...ux.intel.com,
chao.p.peng@...el.com, yan.y.zhao@...el.com
Subject: Re: [RFC PATCH v2 17/23] KVM: guest_memfd: Split for punch hole and
private-to-shared conversion
Yan Zhao <yan.y.zhao@...el.com> writes:
I was looking deeper into this patch since on my WIP tree I already had
the invalidate and zap steps separated out and had to do more to rebase
this patch :)
> In TDX, private page tables require precise zapping because faulting back
> the zapped mappings necessitates the guest's re-acceptance.
I feel that this statement could be better phrased because re-acceptance
is required for all zapped mappings, not just those affected by precise
zapping. Would this be better:
On private-to-shared conversions, page table entries must be zapped
from the Secure EPTs. Any pages mapped into Secure EPTs must be
accepted by the guest before they are used.
Hence, care must be taken to zap precisely the ranges requested for
private-to-shared conversion, since the guest is only prepared to
re-accept precisely the ranges it requested for conversion.
The guest may request to convert ranges that are not aligned with
private page table entry boundaries. To precisely zap these ranges,
huge leaves that span the boundaries of the requested ranges must be
split into smaller leaves, so that the resulting smaller leaves
align with the range requested for zapping.
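To make the boundary condition concrete, here is a sketch on my side
(the helper and its signature are made up, not code from this patch):

/*
 * Sketch only: a huge leaf must be split when the range to zap covers
 * it only partially.  E.g. a 2M leaf mapping GFNs [0x400, 0x600) with
 * a conversion request for [0x480, 0x500) must be split first, so that
 * only [0x480, 0x500) is zapped and needs re-acceptance by the guest.
 */
static bool zap_range_splits_leaf(gfn_t zap_start, gfn_t zap_end,
				  gfn_t leaf_start,
				  unsigned long leaf_npages)
{
	gfn_t leaf_end = leaf_start + leaf_npages;

	/* No overlap: the leaf is untouched by the zap, no split needed. */
	if (zap_end <= leaf_start || zap_start >= leaf_end)
		return false;

	/* Partial overlap: a boundary of the zap range falls inside the leaf. */
	return zap_start > leaf_start || zap_end < leaf_end;
}

kvm_gmem_release() never hits the partial-overlap case since it zaps
everything, which is why it does not need the split step.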
> Therefore,
> before performing a zap for hole punching and private-to-shared
> conversions, huge leaves that cross the boundary of the zapping GFN range in
> the mirror page table must be split.
>
> Splitting may result in an error. If this happens, hole punching and
> private-to-shared conversion should bail out early and return an error to
> userspace.
>
> Splitting is not necessary for kvm_gmem_release() since the entire page
> table is being zapped, nor for kvm_gmem_error_folio() as an SPTE must not
> map more than one physical folio.
>
I think splitting is not necessary as long as aligned page table entries
are zapped. Splitting is also not necessary if the entire page table is
zapped, but that is just a special case of zapping aligned page table
entries. (Probably just a typo on your side.) Here's my attempt at
rephrasing this:
Splitting is not necessary for the cases where only aligned page
table entries are zapped, such as during kvm_gmem_release(), where
the entire guest_memfd worth of memory is zapped, nor for
truncation, where truncating pages within a huge folio is not
allowed.
> Therefore, in this patch,
> - break kvm_gmem_invalidate_begin_and_zap() into
> kvm_gmem_invalidate_begin() and kvm_gmem_zap() and have
> kvm_gmem_release() and kvm_gmem_error_folio() invoke them.
>
> - have kvm_gmem_punch_hole() invoke kvm_gmem_invalidate_begin(),
> kvm_gmem_split_private(), and kvm_gmem_zap().
> Bail out if kvm_gmem_split_private() returns an error.
>
> - drop the old kvm_gmem_unmap_private() and have private-to-shared
> conversion invoke kvm_gmem_split_private() and kvm_gmem_zap() instead.
> Bail out if kvm_gmem_split_private() returns an error.
>
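To check my understanding of the new structure, the punch hole path
would then look roughly like this (a sketch with guessed signatures,
not code from this patch; error handling simplified):

	/* Sketch: parameters and locals are guesses on my part. */
	kvm_gmem_invalidate_begin(gmem, start, end);

	/* Split huge leaves crossing the boundaries of [start, end). */
	ret = kvm_gmem_split_private(gmem, start, end);
	if (ret)
		return ret;	/* bail out to userspace before zapping */

	kvm_gmem_zap(gmem, start, end);

with private-to-shared conversion doing the same split-then-zap
sequence and bailing out in the same way if the split fails.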
> Co-developed-by: Ackerley Tng <ackerleytng@...gle.com>
> Signed-off-by: Ackerley Tng <ackerleytng@...gle.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@...el.com>
> ---
> RFC v2:
> - Rebased to [1]. As changes in this patch are gmem specific, they may need
> to be updated if the implementation in [1] changes.
> - Update kvm_split_boundary_leafs() to kvm_split_cross_boundary_leafs() and
> invoke it before kvm_gmem_punch_hole() and private-to-shared conversion.
>
> [1] https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/
>
> RFC v1:
> - new patch.
>
> [...snip...]
>