linux-kernel - Re: [RFC PATCH 09/21] KVM: TDX: Enable 2MB mapping size after TD is RUNNABLE

Message-ID: <d92b841268888d2e41cd567678a412b2bd829a0b.camel@intel.com>
Date: Wed, 25 Jun 2025 15:51:07 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "Annapurve, Vishal" <vannapurve@...gle.com>
CC: "quic_eberman@...cinc.com" <quic_eberman@...cinc.com>, "Li, Xiaoyao"
	<xiaoyao.li@...el.com>, "Huang, Kai" <kai.huang@...el.com>, "Du, Fan"
	<fan.du@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
	"david@...hat.com" <david@...hat.com>, "thomas.lendacky@....com"
	<thomas.lendacky@....com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>, "Li,
 Zhiquan1" <zhiquan1.li@...el.com>, "Shutemov, Kirill"
	<kirill.shutemov@...el.com>, "michael.roth@....com" <michael.roth@....com>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "Weiny, Ira" <ira.weiny@...el.com>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>, "Peng, Chao P"
	<chao.p.peng@...el.com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"vbabka@...e.cz" <vbabka@...e.cz>, "ackerleytng@...gle.com"
	<ackerleytng@...gle.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "tabba@...gle.com"
	<tabba@...gle.com>, "jroedel@...e.de" <jroedel@...e.de>, "Miao, Jun"
	<jun.miao@...el.com>, "pgonda@...gle.com" <pgonda@...gle.com>,
	"x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH 09/21] KVM: TDX: Enable 2MB mapping size after TD is
 RUNNABLE

On Wed, 2025-06-25 at 06:47 -0700, Vishal Annapurve wrote:
> On Tue, Jun 24, 2025 at 11:36 AM Edgecombe, Rick P
> <rick.p.edgecombe@...el.com> wrote:
> > ...
> > For leaving the option open to promote the GFNs in the future, a GHCI interface
> > or similar could be defined for the guest to say "I don't care about page size
> > anymore for this gfn". So it won't close it off forever.
> > 
> 
> I think it's in the host's interest to get the pages mapped at large
> page granularity whenever possible. Even if guest doesn't buy-in into
> the "future" GHCI interface, there should be some ABI between TDX
> module and host VMM to allow promotion probably as soon as all the
> ranges within a hugepage get accepted but are still mapped at 4K
> granularity.

With the 4k accept size, the guest is kind of requesting a specific host page
size. I agree it's not good to let the guest influence the host's resource
usage. But this already happens with private/shared conversions.

As for future promotion opportunities, I think that part needs a re-think. I
don't think the cost/benefit is really there today. If we had a simpler solution
(we discussed some TDX module changes offline), it would change the calculus. But we
shouldn't focus too much on the ideal TDX implementation. Getting the ideal case
upstream is far, far away. In the meantime we should focus on the simplest
things with the most benefit. In the end I'd expect an iterative, evolving
implementation to be faster to upstream than thinking through how it works with
every idea. The exception is thinking through a sane ABI ahead of time.

I don't think we necessarily need a GHCI interface to expose control of host
page sizes to the guest, but I think it might help with determinism. I meant it
sort of as an escape hatch. Like if we find some nasty races that prevent
optimizations for promotion, we could have an option to have the guest help by
making the ABI around page sizes more formal.

Side topic on page promotion: I'm wondering if the biggest bang-for-the-buck
promotion opportunity will be the memory that gets added via PAGE.ADD at TD
startup time. That is a narrow, specific case that may be easier to attack.