Message-ID: <8e783fa6ee3997567c661e5c10b05b5d456382fb.camel@intel.com>
Date: Fri, 16 May 2025 19:14:46 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "palmer@...belt.com" <palmer@...belt.com>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "catalin.marinas@....com" <catalin.marinas@....com>,
	"Miao, Jun" <jun.miao@...el.com>, "nsaenz@...zon.es" <nsaenz@...zon.es>,
	"pdurrant@...zon.co.uk" <pdurrant@...zon.co.uk>, "vbabka@...e.cz"
	<vbabka@...e.cz>, "peterx@...hat.com" <peterx@...hat.com>, "x86@...nel.org"
	<x86@...nel.org>, "jack@...e.cz" <jack@...e.cz>, "tabba@...gle.com"
	<tabba@...gle.com>, "quic_svaddagi@...cinc.com" <quic_svaddagi@...cinc.com>,
	"amoorthy@...gle.com" <amoorthy@...gle.com>, "pvorel@...e.cz"
	<pvorel@...e.cz>, "vkuznets@...hat.com" <vkuznets@...hat.com>,
	"mail@...iej.szmigiero.name" <mail@...iej.szmigiero.name>, "Annapurve,
 Vishal" <vannapurve@...gle.com>, "anthony.yznaga@...cle.com"
	<anthony.yznaga@...cle.com>, "Wang, Wei W" <wei.w.wang@...el.com>,
	"keirf@...gle.com" <keirf@...gle.com>, "Wieczor-Retman, Maciej"
	<maciej.wieczor-retman@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>,
	"ajones@...tanamicro.com" <ajones@...tanamicro.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, "rppt@...nel.org" <rppt@...nel.org>,
	"quic_mnalajal@...cinc.com" <quic_mnalajal@...cinc.com>, "aik@....com"
	<aik@....com>, "usama.arif@...edance.com" <usama.arif@...edance.com>,
	"fvdl@...gle.com" <fvdl@...gle.com>, "paul.walmsley@...ive.com"
	<paul.walmsley@...ive.com>, "bfoster@...hat.com" <bfoster@...hat.com>,
	"quic_cvanscha@...cinc.com" <quic_cvanscha@...cinc.com>,
	"willy@...radead.org" <willy@...radead.org>, "Du, Fan" <fan.du@...el.com>,
	"quic_eberman@...cinc.com" <quic_eberman@...cinc.com>,
	"thomas.lendacky@....com" <thomas.lendacky@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mic@...ikod.net" <mic@...ikod.net>, "oliver.upton@...ux.dev"
	<oliver.upton@...ux.dev>, "akpm@...ux-foundation.org"
	<akpm@...ux-foundation.org>, "steven.price@....com" <steven.price@....com>,
	"muchun.song@...ux.dev" <muchun.song@...ux.dev>, "binbin.wu@...ux.intel.com"
	<binbin.wu@...ux.intel.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
	"rientjes@...gle.com" <rientjes@...gle.com>, "Aktas, Erdem"
	<erdemaktas@...gle.com>, "mpe@...erman.id.au" <mpe@...erman.id.au>,
	"david@...hat.com" <david@...hat.com>, "jgg@...pe.ca" <jgg@...pe.ca>,
	"hughd@...gle.com" <hughd@...gle.com>, "Xu, Haibo1" <haibo1.xu@...el.com>,
	"jhubbard@...dia.com" <jhubbard@...dia.com>, "anup@...infault.org"
	<anup@...infault.org>, "maz@...nel.org" <maz@...nel.org>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>, "jthoughton@...gle.com" <jthoughton@...gle.com>,
	"steven.sistare@...cle.com" <steven.sistare@...cle.com>,
	"quic_pheragu@...cinc.com" <quic_pheragu@...cinc.com>, "jarkko@...nel.org"
	<jarkko@...nel.org>, "Shutemov, Kirill" <kirill.shutemov@...el.com>,
	"chenhuacai@...nel.org" <chenhuacai@...nel.org>, "Huang, Kai"
	<kai.huang@...el.com>, "shuah@...nel.org" <shuah@...nel.org>,
	"dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "pankaj.gupta@....com"
	<pankaj.gupta@....com>, "Peng, Chao P" <chao.p.peng@...el.com>,
	"nikunj@....com" <nikunj@....com>, "Graf, Alexander" <graf@...zon.com>,
	"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "yuzenghui@...wei.com" <yuzenghui@...wei.com>,
	"jroedel@...e.de" <jroedel@...e.de>, "suzuki.poulose@....com"
	<suzuki.poulose@....com>, "jgowans@...zon.com" <jgowans@...zon.com>, "Xu,
 Yilun" <yilun.xu@...el.com>, "liam.merwick@...cle.com"
	<liam.merwick@...cle.com>, "michael.roth@....com" <michael.roth@....com>,
	"quic_tsoni@...cinc.com" <quic_tsoni@...cinc.com>,
	"richard.weiyang@...il.com" <richard.weiyang@...il.com>, "Weiny, Ira"
	<ira.weiny@...el.com>, "aou@...s.berkeley.edu" <aou@...s.berkeley.edu>, "Li,
 Xiaoyao" <xiaoyao.li@...el.com>, "qperret@...gle.com" <qperret@...gle.com>,
	"kent.overstreet@...ux.dev" <kent.overstreet@...ux.dev>,
	"dmatlack@...gle.com" <dmatlack@...gle.com>, "james.morse@....com"
	<james.morse@....com>, "brauner@...nel.org" <brauner@...nel.org>,
	"hch@...radead.org" <hch@...radead.org>, "ackerleytng@...gle.com"
	<ackerleytng@...gle.com>, "linux-fsdevel@...r.kernel.org"
	<linux-fsdevel@...r.kernel.org>, "pgonda@...gle.com" <pgonda@...gle.com>,
	"quic_pderrin@...cinc.com" <quic_pderrin@...cinc.com>, "roypat@...zon.co.uk"
	<roypat@...zon.co.uk>, "will@...nel.org" <will@...nel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd

On Fri, 2025-05-16 at 10:51 -0700, Sean Christopherson wrote:
> From my perspective, 1GiB hugepage support in guest_memfd isn't about improving
> CoCo performance, it's about achieving feature parity on guest_memfd with respect
> to existing backing stores so that it's possible to use guest_memfd to back all
> VM shapes in a fleet.
> 
> Let's assume there is significant value in backing non-CoCo VMs with 1GiB pages,
> unless you want to re-litigate the existence of 1GiB support in HugeTLBFS.

I didn't expect to go in that direction when I first asked. But everyone says
the benefit is huge, yet no one knows the numbers. That can be a sign of things.

Meanwhile I'm watching patches to make 5-level paging walks unconditional fly by
because people couldn't find a cost to the extra level of walk. So re-litigate,
no. But I'll probably remain quietly suspicious of the exact cost/value, at
least on the CPU side. I totally missed the IOTLB side at first, sorry.

> 
> If we assume 1GiB support is mandatory for non-CoCo VMs, then it becomes mandatory
> for CoCo VMs as well, because it's the only realistic way to run CoCo VMs and
> non-CoCo VMs on a single host.  Mixing 1GiB HugeTLBFS with any other backing store
> for VMs simply isn't tenable due to the nature of 1GiB allocations.  E.g. grabbing
> sub-1GiB chunks of memory for CoCo VMs quickly fragments memory to the point where
> HugeTLBFS can't allocate memory for non-CoCo VMs.

It makes sense that there would be a difference in how many huge pages the
non-CoCo guests would get. Where you start to lose me is when you guys talk
about "mandatory" or similar. If you want upstream review, it would help to have
more numbers on the "why" question. At least for us folks outside the
hyperscalers, where such things are not as obvious.

> 
> Teaching HugeTLBFS to play nice with TDX and SNP isn't happening, which leaves
> adding 1GiB support to guest_memfd as the only way forward.
> 
> Any boost to TDX (or SNP) performance is purely a bonus.

Most of the bullets in the talk were about mapping sizes AFAICT, so this is the
kind of reasoning I was hoping for. Thanks for elaborating on it, even though
no one has any numbers yet besides the vmemmap savings.

