linux-kernel - Re: [PATCH v3 00/24] KVM: TDX huge page support for private memory

Message-ID: <CAGtprH-eEUzHDUB0CK2V162HHqvE8kT3bAacb6d3xDYJPwBiYA@mail.gmail.com>
Date: Tue, 6 Jan 2026 09:47:19 -0800
From: Vishal Annapurve <vannapurve@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: pbonzini@...hat.com, seanjc@...gle.com, linux-kernel@...r.kernel.org, 
	kvm@...r.kernel.org, x86@...nel.org, rick.p.edgecombe@...el.com, 
	dave.hansen@...el.com, kas@...nel.org, tabba@...gle.com, 
	ackerleytng@...gle.com, michael.roth@....com, david@...nel.org, 
	sagis@...gle.com, vbabka@...e.cz, thomas.lendacky@....com, 
	nik.borisov@...e.com, pgonda@...gle.com, fan.du@...el.com, jun.miao@...el.com, 
	francescolavra.fl@...il.com, jgross@...e.com, ira.weiny@...el.com, 
	isaku.yamahata@...el.com, xiaoyao.li@...el.com, kai.huang@...el.com, 
	binbin.wu@...ux.intel.com, chao.p.peng@...el.com, chao.gao@...el.com
Subject: Re: [PATCH v3 00/24] KVM: TDX huge page support for private memory

On Tue, Jan 6, 2026 at 2:19 AM Yan Zhao <yan.y.zhao@...el.com> wrote:
>
> - EPT mapping size and folio size
>
>   This series is built upon the rule in KVM that the mapping size in the
>   KVM-managed secondary MMU is no larger than the backend folio size.
>
>   Therefore, there are sanity checks in the SEAMCALL wrappers in patches
>   1-5 to follow this rule, either in tdh_mem_page_aug() for mapping
>   (patch 1) or in tdh_phymem_page_wbinvd_hkid() (patch 3),
>   tdx_quirk_reset_folio() (patch 4), tdh_phymem_page_reclaim() (patch 5)
>   for unmapping.
>
>   However, as Vishal pointed out in [7], the new hugetlb-based guest_memfd
>   [1] splits backend folios ahead of notifying KVM for unmapping. So, this
>   series also relies on the fixup patch [8] to notify KVM of unmapping
>   before splitting the backend folio during the memory conversion ioctl.
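
As an illustration of the rule quoted above (this is not code from the
series), a minimal userspace sketch of the "mapping size no larger than
the backend folio size" check might look like the following. The names
demo_folio, demo_level and demo_mapping_fits_folio() are hypothetical
stand-ins; the actual sanity checks are in the SEAMCALL wrappers that
patches 1-5 touch and operate on real kernel folios.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a backend folio; not the kernel's struct folio. */
struct demo_folio {
	unsigned long start_pfn;	/* first page frame number in the folio */
	unsigned long nr_pages;		/* number of 4KB pages in the folio */
};

/* Only the two mapping sizes discussed in the thread are modeled. */
enum demo_level { DEMO_LEVEL_4K = 0, DEMO_LEVEL_2M = 1 };

/* x86 page tables use 9 bits per level, so 2MB covers 512 4KB pages. */
static unsigned long demo_pages_per_level(enum demo_level level)
{
	assert(level == DEMO_LEVEL_4K || level == DEMO_LEVEL_2M);
	return 1UL << (9 * level);
}

/*
 * Return true if a mapping of the given level that starts at pfn lies
 * entirely inside the folio, i.e. the mapping does not exceed the folio.
 */
static bool demo_mapping_fits_folio(const struct demo_folio *folio,
				    unsigned long pfn, enum demo_level level)
{
	unsigned long npages = demo_pages_per_level(level);

	if (pfn < folio->start_pfn)
		return false;
	return (pfn - folio->start_pfn) + npages <= folio->nr_pages;
}
```

In this model, a 2MB mapping over a folio that has already been split to
4KB fails the check, which is the situation the fixup patch [8] tries to
avoid by notifying KVM before the split.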

I think the major issue here is that if splitting fails, there is no
way to undo the unmapping [1]. How should KVM, the VMM, and the guest
handle the case where the guest requested a conversion to shared, the
conversion failed, and the memory is no longer mapped as private?

[1] https://lore.kernel.org/kvm/aN8P87AXlxlEDdpP@google.com/
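
To make the failure case concrete, here is a toy model of the
zap-then-split ordering. It is only a sketch under stated assumptions:
demo_range and demo_convert_to_shared() are hypothetical names, and the
split result is passed in as a flag rather than coming from any real
guest_memfd or hugetlb call.

```c
#include <stdbool.h>

/* Hypothetical model of one private huge range; not a kernel structure. */
struct demo_range {
	bool mapped_private;	/* still mapped as private in the secondary MMU */
	bool folio_split;	/* backend folio has been split to 4KB */
	bool shared;		/* conversion to shared has completed */
};

/*
 * Model of a private-to-shared conversion that unmaps before splitting.
 * split_succeeds simulates the outcome of the folio split.
 * Returns 0 on success and -1 if the split fails.
 */
static int demo_convert_to_shared(struct demo_range *r, bool split_succeeds)
{
	/* Step 1: unmap (zap) the private mapping before touching the folio. */
	r->mapped_private = false;

	/* Step 2: split the folio; on failure nothing restores the mapping. */
	if (!split_succeeds)
		return -1;

	r->folio_split = true;
	r->shared = true;
	return 0;
}
```

After a failed call in this model, the range is neither shared nor mapped
as private, which is the state the question above is asking about.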

> Four issues are identified in the WIP hugetlb-based guest_memfd [1]:
>
> (1) Compilation error due to missing symbol export of
>     hugetlb_restructuring_free_folio().
>
> (2) guest_memfd splits backend folios when the folio is still mapped as
>     huge in KVM (which breaks KVM's basic assumption that EPT mapping size
>     should not exceed the backend folio size).
>
> (3) guest_memfd is incapable of merging folios to huge for
>     shared-to-private conversions.
>
> (4) Unnecessarily disabling huge private mappings when HVA is not 2M-aligned,
>     given that shared pages can only be mapped at 4KB.
>
> So, this series also depends on the four fixup patches included in [4]:
>
> [FIXUP] KVM: guest_memfd: Allow gmem slot lpage even with non-aligned uaddr
> [FIXUP] KVM: guest_memfd: Allow merging folios after to-private conversion
> [FIXUP] KVM: guest_memfd: Zap mappings before splitting backend folios
> [FIXUP] mm: hugetlb_restructuring: Export hugetlb_restructuring_free_folio()
>
> (lkp sent me some more gmem compilation errors. I ignored them as I didn't
>  encounter them with my config and env).
>
> ...
>
> [0] RFC v2: https://lore.kernel.org/all/20250807093950.4395-1-yan.y.zhao@intel.com
> [1] hugetlb-based gmem: https://github.com/googleprodkernel/linux-cc/tree/wip-gmem-conversions-hugetlb-restructuring-12-08-25
> [2] gmem-population rework v2: https://lore.kernel.org/all/20251215153411.3613928-1-michael.roth@amd.com
> [3] DPAMT v4: https://lore.kernel.org/kvm/20251121005125.417831-1-rick.p.edgecombe@intel.com
> [4] kernel full stack: https://github.com/intel-staging/tdx/tree/huge_page_v3
> [5] https://lore.kernel.org/all/aF0Kg8FcHVMvsqSo@yzhao56-desk.sh.intel.com
> [6] https://lore.kernel.org/all/aGSoDnODoG2%2FpbYn@yzhao56-desk.sh.intel.com
> [7] https://lore.kernel.org/all/CAGtprH9vdpAGDNtzje=7faHBQc9qTSF2fUEGcbCkfJehFuP-rw@mail.gmail.com
> [8] https://github.com/intel-staging/tdx/commit/a8aedac2df44e29247773db3444bc65f7100daa1
> [9] https://github.com/intel-staging/tdx/commit/8747667feb0b37daabcaee7132c398f9e62a6edd
> [10] https://github.com/intel-staging/tdx/commit/ab29a85ec2072393ab268e231c97f07833853d0d
> [11] https://github.com/intel-staging/tdx/commit/4feb6bf371f3a747b71fc9f4ded25261e66b8895
>