lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAAhR5DFXAmgDKy-5dvUCK0AU9Jk58A8mB+N63FTb2oFto_bTpg@mail.gmail.com>
Date: Mon, 24 Nov 2025 14:18:22 -0600
From: Sagi Shahar <sagis@...gle.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>
Cc: bp@...en8.de, chao.gao@...el.com, dave.hansen@...el.com, 
	isaku.yamahata@...el.com, kai.huang@...el.com, kas@...nel.org, 
	kvm@...r.kernel.org, linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org, 
	mingo@...hat.com, pbonzini@...hat.com, seanjc@...gle.com, tglx@...utronix.de, 
	vannapurve@...gle.com, x86@...nel.org, yan.y.zhao@...el.com, 
	xiaoyao.li@...el.com, binbin.wu@...el.com
Subject: Re: [PATCH v4 00/16] TDX: Enable Dynamic PAMT

On Thu, Nov 20, 2025 at 6:51 PM Rick Edgecombe
<rick.p.edgecombe@...el.com> wrote:
>
> Hi,
>
> This is 4th revision of Dynamic PAMT, which is a new feature that reduces
> the memory use of TDX. For background information, see the v3
> coverletter[0]. V4 mostly consists of incorporating the feedback from v3.
> Notably what it *doesn’t* change is the solution for pre-allocating DPAMT
> backing pages. (more info below in “Changes”)
>
> I think this series isn’t quite ready to ask for merging just yet. I’d
> appreciate another round of review especially looking for any issues in the
> refcount allocation/mapping and the pamt get/put lock-refcount dance. And
> hopefully collect some RBs.
>
> Sean/Paolo, we have mostly banished this all to TDX code except for “KVM:
> TDX: Add x86 ops for external spt cache” patch. That and “x86/virt/tdx:
> Add helpers to allow for pre-allocating pages” are probably the remaining
> possibly controversial parts of your domain. If you only have a short time
> to spend at this point, I’d point you at those two patches.
>
> Since most of the changes are in arch/x86, I’d think this feature could be
> a candidate for eventually merging through tip with Sean or Paolo’s ack.
> But it is currently based on kvm-x86/next in order to build on top of the
> post-populate cleanup series. Next time I’ll probably target tip if people
> think that is a good way forward.
>
> Changes
> =======
> There were two good suggestions around the pre-allocated pages solution
> last time, but neither ended up working out:
>
> 1. Dave suggested to use mempool_t instead of the linked list based
> structure, in order to not re-invent the wheel. This turned out to not
> quite fit. The problems were that there wasn’t really a “topup” mechanism,
> or an atomic fallback (which matches the kvm cache behavior). This results
> in very similar code being built around mempool that was built around the
> linked list. It was an overall harder to follow solution for not much code
> savings.
>
> I strongly considered going back to Kiryl’s original solution which passed
> a callback function pointer for allocating DPAMT pages, and an opaque
> void * that the callback could use to find the kvm_mmu_memory_cache. I
> thought that readability issues of passing the opaque void * between
> subsystems outweighed the small code duplication in the simple, familiar
> patterned linked list-based code. So I ended up leaving it.
>
> 2. Kai suggested (but later retracted the idea) that since the external
> page table cache was moved to TDX code, it could simply install DPAMT
> pages for the cache at topup time. Then the installation of DPAMT backing
> for S-EPT page tables could be done outside of the mmu_lock. It could also
> be argued that it makes the design simpler in a way, because the external
> page table cache acts like it did before. Anything in there could be simply
> used.
>
> At the time my argument against this was that whether a huge page would be
> installed (and thus, whether DPAMT backing was needed) for the guest
> private memory would not be known until later, so early install solution
> would need special late handling for TDX huge pages. After some internal
> discussions I at looked how we could simplify the series by punting on TDX
> huge pages needs.
>
> But it turns out that this other design was actually more complex and had
> more LOC than the previous solution. So it was dropped, and again, I went
> back to the original solution.
>
>
> I’m really starting to think that, while the overall solution here isn’t
> the most elegant, we might not have much more to squeeze from it. So
> design-wise, I think we should think about calling it done.
>
> Testing
> =======
> Based on kvm-x86/next (4531ff85d925). Testing was the usual, except I also
> tested with TDX modules that don't support DPAMT, and with the two
> optimization patches removed: “Improve PAMT refcounters allocation for
> sparse memory” and “x86/virt/tdx: Optimize tdx_alloc/free_page() helpers”.
>
> [0] https://lore.kernel.org/kvm/20250918232224.2202592-1-rick.p.edgecombe@intel.com/
>
>
> Kirill A. Shutemov (13):
>   x86/tdx: Move all TDX error defines into <asm/shared/tdx_errno.h>
>   x86/tdx: Add helpers to check return status codes
>   x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
>   x86/virt/tdx: Allocate reference counters for PAMT memory
>   x86/virt/tdx: Improve PAMT refcounts allocation for sparse memory
>   x86/virt/tdx: Add tdx_alloc/free_page() helpers
>   x86/virt/tdx: Optimize tdx_alloc/free_page() helpers
>   KVM: TDX: Allocate PAMT memory for TD control structures
>   KVM: TDX: Allocate PAMT memory for vCPU control structures
>   KVM: TDX: Handle PAMT allocation in fault path
>   KVM: TDX: Reclaim PAMT memory
>   x86/virt/tdx: Enable Dynamic PAMT
>   Documentation/x86: Add documentation for TDX's Dynamic PAMT
>
> Rick Edgecombe (3):
>   x86/virt/tdx: Simplify tdmr_get_pamt_sz()
>   KVM: TDX: Add x86 ops for external spt cache
>   x86/virt/tdx: Add helpers to allow for pre-allocating pages
>
>  Documentation/arch/x86/tdx.rst              |  21 +
>  arch/x86/coco/tdx/tdx.c                     |  10 +-
>  arch/x86/include/asm/kvm-x86-ops.h          |   3 +
>  arch/x86/include/asm/kvm_host.h             |  14 +-
>  arch/x86/include/asm/shared/tdx.h           |   8 +
>  arch/x86/include/asm/shared/tdx_errno.h     | 104 ++++
>  arch/x86/include/asm/tdx.h                  |  78 ++-
>  arch/x86/include/asm/tdx_global_metadata.h  |   1 +
>  arch/x86/kvm/mmu/mmu.c                      |   6 +-
>  arch/x86/kvm/mmu/mmu_internal.h             |   2 +-
>  arch/x86/kvm/vmx/tdx.c                      | 160 ++++--
>  arch/x86/kvm/vmx/tdx.h                      |   3 +-
>  arch/x86/kvm/vmx/tdx_errno.h                |  40 --
>  arch/x86/virt/vmx/tdx/tdx.c                 | 587 +++++++++++++++++---
>  arch/x86/virt/vmx/tdx/tdx.h                 |   5 +-
>  arch/x86/virt/vmx/tdx/tdx_global_metadata.c |   7 +
>  16 files changed, 854 insertions(+), 195 deletions(-)
>  create mode 100644 arch/x86/include/asm/shared/tdx_errno.h
>  delete mode 100644 arch/x86/kvm/vmx/tdx_errno.h
>
> --
> 2.51.2
>
>

Tested-by: Sagi Shahar <sagis@...gle.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ