Message-ID: <aJtolM_59M5xVxcY@google.com>
Date: Tue, 12 Aug 2025 09:15:16 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Rick P Edgecombe <rick.p.edgecombe@...el.com>
Cc: "kas@...nel.org" <kas@...nel.org>, Vishal Annapurve <vannapurve@...gle.com>, Chao Gao <chao.gao@...el.com>, 
	"x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>, Kai Huang <kai.huang@...el.com>, 
	"mingo@...hat.com" <mingo@...hat.com>, Yan Y Zhao <yan.y.zhao@...el.com>, 
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>, 
	"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>, 
	"tglx@...utronix.de" <tglx@...utronix.de>, Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [PATCHv2 00/12] TDX: Enable Dynamic PAMT

On Tue, Aug 12, 2025, Rick P Edgecombe wrote:
> On Tue, 2025-08-12 at 09:04 +0100, kas@...nel.org wrote:
> > > > E.g. for things like TDCS pages and to some extent non-leaf S-EPT
> > > > pages, on-demand PAMT management seems reasonable.  But for PAMTs that
> > > > are used to track guest-assigned memory, which is the vaaast majority
> > > > of PAMT memory, why not hook guest_memfd?
> > > 
> > > This seems fine for 4K page backing. But when TDX VMs have huge page
> > > backing, the vast majority of private memory wouldn't need PAMT
> > > allocation for 4K granularity.
> > > 
> > > IIUC guest_memfd allocation happening at 2M granularity doesn't
> > > necessarily translate to 2M mapping in guest EPT entries. If the DPAMT
> > > support is to be properly utilized for huge page backings, there is a
> > > value in not attaching PAMT allocation with guest_memfd allocation.

I don't disagree, but the host needs to plan for the worst, especially since the
guest can effectively dictate the max page size of S-EPT mappings.  AFAIK, there
are no plans to support memory overcommit for TDX guests, so unless a deployment
wants to roll the dice and hope TDX guests will use hugepages for N% of their
memory, the host will want to reserve 0.4% of guest memory for PAMTs to ensure
it doesn't unintentionally DoS the guest with an OOM condition.
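
To put a number on the reservation argument above, here is a minimal sketch of the arithmetic. It assumes the ~0.4% figure quoted in this message and a simple linear model in which only the 4KiB-mapped share of guest memory needs 4KiB-granularity PAMT. The function name and the `hugepage_percent` parameter are hypothetical, not part of KVM or the TDX module.

```python
GIB = 1 << 30

# Assumption: the ~0.4% overhead figure from the message above, expressed
# per mille so the arithmetic stays in integers.
PAMT_OVERHEAD_PER_MILLE = 4


def pamt_reserve_bytes(guest_bytes, hugepage_percent=0):
    """Estimate PAMT bytes for the 4KiB-mapped share of guest memory.

    hugepage_percent is the (hypothetical) share of guest memory the host
    bets will stay mapped with hugepages; 0 is the worst case.
    """
    if not 0 <= hugepage_percent <= 100:
        raise ValueError("hugepage_percent must be between 0 and 100")
    small_bytes = guest_bytes * (100 - hugepage_percent) // 100
    return small_bytes * PAMT_OVERHEAD_PER_MILLE // 1000


if __name__ == "__main__":
    guest = 64 * GIB
    for pct in (0, 50, 90):
        mib = pamt_reserve_bytes(guest, pct) / (1 << 20)
        print(f"{pct:3d}% hugepages -> reserve ~{mib:.1f} MiB of PAMT")
```

Planning for the worst case corresponds to `hugepage_percent=0`; any larger value is the "roll the dice" bet described above.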

Ditto for any use case that wants to support dirty logging (ugh), because dirty
logging will require demoting all of guest memory to 4KiB mappings.
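
As a rough illustration of why dirty logging lands in the worst case, the sketch below counts mappings before and after a full demotion from 2MiB to 4KiB. The helper name is hypothetical and it only models page counts, not any KVM or TDX module behavior.

```python
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30


def mappings_after_demotion(guest_bytes, huge_size=2 * MIB, small_size=4 * KIB):
    """Return (huge mappings before, small mappings after) a full demotion."""
    if guest_bytes % huge_size:
        raise ValueError("guest_bytes must be a multiple of huge_size")
    before = guest_bytes // huge_size
    after = guest_bytes // small_size
    return before, after


if __name__ == "__main__":
    before, after = mappings_after_demotion(GIB)
    print(f"1 GiB: {before} x 2MiB mappings -> {after} x 4KiB mappings")
```

Each 2MiB mapping becomes 512 4KiB mappings, so a guest with dirty logging enabled would need 4KiB-granularity PAMT coverage for all of its memory.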

> > Right.
> > 
> > It also requires special handling in many places in core-mm. Like, what
> > happens if THP in guest memfd got split. Who would allocate PAMT for it?

guest_memfd?  I don't see why core-mm would need to get involved.  And I definitely
don't see how handling page splits in guest_memfd would be more complicated than
handling them in KVM's MMU.

> > Migration will be more complicated too (when we get there).

Which type of migration?  Live migration or page migration?

> I actually went down this path too, but the problem I hit was that TDX module
> wants the PAMT page size to match the S-EPT page size. 

Right, but over-populating the PAMT would just result in "wasted" memory, correct?
I.e. KVM can always provide more PAMT entries than are needed.  Or am I
misunderstanding how dynamic PAMT works?

In other words, IMO, reclaiming PAMT pages on-demand is also a premature optimization
of sorts, as it's not obvious to me that the host would actually be able to take
advantage of the unused memory.

> And the S-EPT size will depend on runtime behavior of the guest. I'm not sure
> why TDX module requires this though. Kirill, I'd be curious to understand the
> constraint more if you recall.
> 
> But in any case, it seems there are multiple reasons.
