Message-ID: <aPD-dbl5KWNSHu5R@gourry-fedora-PF4VCD3F>
Date: Thu, 16 Oct 2025 10:17:25 -0400
From: Gregory Price <gourry@...rry.net>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Shivank Garg <shivankg@....com>, jgowans@...zon.com, mhocko@...e.com,
	jack@...e.cz, kvm@...r.kernel.org, david@...hat.com,
	linux-btrfs@...r.kernel.org, aik@....com, papaluri@....com,
	kalyazin@...zon.com, peterx@...hat.com, linux-mm@...ck.org,
	clm@...com, ddutile@...hat.com, linux-kselftest@...r.kernel.org,
	shdhiman@....com, gshan@...hat.com, ying.huang@...ux.alibaba.com,
	shuah@...nel.org, roypat@...zon.co.uk, matthew.brost@...el.com,
	linux-coco@...ts.linux.dev, zbestahu@...il.com,
	lorenzo.stoakes@...cle.com, linux-bcachefs@...r.kernel.org,
	ira.weiny@...el.com, dhavale@...gle.com, jmorris@...ei.org,
	willy@...radead.org, hch@...radead.org, chao.gao@...el.com,
	tabba@...gle.com, ziy@...dia.com, rientjes@...gle.com,
	yuzhao@...gle.com, xiang@...nel.org, nikunj@....com,
	serge@...lyn.com, amit@...radead.org, thomas.lendacky@....com,
	ashish.kalra@....com, chao.p.peng@...el.com, yan.y.zhao@...el.com,
	byungchul@...com, michael.day@....com, Neeraj.Upadhyay@....com,
	michael.roth@....com, bfoster@...hat.com, bharata@....com,
	josef@...icpanda.com, Liam.Howlett@...cle.com,
	ackerleytng@...gle.com, dsterba@...e.com, viro@...iv.linux.org.uk,
	jefflexu@...ux.alibaba.com, jaegeuk@...nel.org,
	dan.j.williams@...el.com, surenb@...gle.com, vbabka@...e.cz,
	paul@...l-moore.com, joshua.hahnjy@...il.com, apopple@...dia.com,
	brauner@...nel.org, quic_eberman@...cinc.com, rakie.kim@...com,
	cgzones@...glemail.com, pvorel@...e.cz,
	linux-erofs@...ts.ozlabs.org, kent.overstreet@...ux.dev,
	linux-kernel@...r.kernel.org,
	linux-f2fs-devel@...ts.sourceforge.net, pankaj.gupta@....com,
	linux-security-module@...r.kernel.org, lihongbo22@...wei.com,
	linux-fsdevel@...r.kernel.org, pbonzini@...hat.com,
	akpm@...ux-foundation.org, vannapurve@...gle.com,
	suzuki.poulose@....com, rppt@...nel.org, jgg@...dia.com
Subject: Re: [f2fs-dev] [PATCH kvm-next V11 6/7] KVM: guest_memfd: Enforce
 NUMA mempolicy using shared policy

On Wed, Oct 15, 2025 at 03:48:38PM -0700, Sean Christopherson wrote:
> On Wed, Oct 15, 2025, Gregory Price wrote:
> > why is __kvm_gmem_get_policy using
> > 	mpol_shared_policy_lookup()
> > instead of
> > 	get_vma_policy()
> 
> With the disclaimer that I haven't followed the gory details of this series super
> closely, my understanding is...
> 
> Because the VMA is a means to an end, and we want the policy to persist even if
> the VMA goes away.
> 

Ah, you know, now that I've taken a close look, I can see that you've
essentially modeled this after the ipc/shm.c / mm/shmem.c pattern.

What's had me scratching my chin is that shm/shmem already has a
mempolicy pattern which ends up using folio_alloc_mpol() where the
relationship is

tmpfs: sb_info->mpol = default set by user
  create_file: inode inherits copy of sb_info->mpol
    fault:    mpol = shmem_get_pgoff_policy(info, index, order, &ilx);
             folio = folio_alloc_mpol(gfp, order, mpol, ilx, numa_node_id())
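
To make that chain concrete, here is a toy userspace model of it. This is
only a sketch, not kernel code: every toy_* name is hypothetical, the policy
is reduced to a mode plus a single node, and the fallback order (per-range
override first, then the inode's inherited copy) is an assumption of the
model rather than a statement about what mm/shmem.c does in every case.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are kernel types. */
enum toy_mode { TOY_DEFAULT, TOY_BIND, TOY_INTERLEAVE };

struct toy_mpol {
	enum toy_mode mode;
	int node;			/* single target node, for simplicity */
};

struct toy_range {
	unsigned long start, end;	/* page-index range [start, end) */
	struct toy_mpol pol;
};

struct toy_sb {
	struct toy_mpol mpol;		/* "default set by user" at mount time */
};

#define TOY_MAX_RANGES 4

struct toy_inode {
	struct toy_mpol mpol;		/* copy inherited from the superblock */
	struct toy_range ranges[TOY_MAX_RANGES];
	size_t nr_ranges;
};

/* create_file: the inode takes a copy of the default, not a reference. */
static void toy_create_file(const struct toy_sb *sb, struct toy_inode *inode)
{
	inode->mpol = sb->mpol;
	inode->nr_ranges = 0;
}

/* Stand-in for an mbind()-style override on part of the file. */
static int toy_set_range(struct toy_inode *inode, unsigned long start,
			 unsigned long end, struct toy_mpol pol)
{
	if (start >= end || inode->nr_ranges == TOY_MAX_RANGES)
		return -1;
	inode->ranges[inode->nr_ranges++] = (struct toy_range){ start, end, pol };
	return 0;
}

/* fault: pick the policy for one page index, newest override first. */
static struct toy_mpol toy_get_pgoff_policy(const struct toy_inode *inode,
					    unsigned long index)
{
	for (size_t i = inode->nr_ranges; i-- > 0; ) {
		const struct toy_range *r = &inode->ranges[i];

		if (index >= r->start && index < r->end)
			return r->pol;
	}
	return inode->mpol;
}
```

The point of the model is that the fault path never needs a VMA: it asks the
inode for a policy by page index, and later changes to the superblock default
do not affect files that already exist.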

So this inode mempolicy in guest_memfd is really acting more as the
filesystem-default mempolicy, which you want to survive even if userland
never maps the memory, or maps it and later unmaps it.

So the relationship is more like

guest_memfd -> creates fd/inode <- copies task mempolicy (if set)
  vm:  allocates memory via filemap_get_folio_mpol()
  userland mmap(fd):
    creates new inode<->vma mapping
    vma->mpol = kvm_gmem_get_policy()
    calls to set/get_policy/mbind go through kvm_gmem
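
As a sanity check on that reading, a toy userspace model of the lifetime
rule. Again a sketch only: the toy_* names are hypothetical, the policy is
reduced to "set or not, plus one node", and nothing here is the actual
guest_memfd code. What it shows is that policy set through a mapping lands on
the inode, so the VM's allocation path still sees it after the mapping is
gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are kernel types. */
struct toy_mpol {
	bool set;
	int node;
};

struct toy_gmem_inode {
	struct toy_mpol mpol;		/* lives as long as the inode */
};

struct toy_vma {
	struct toy_gmem_inode *inode;	/* NULL once unmapped */
};

/* fd creation: copy the creating task's policy, if it has one. */
static void toy_gmem_create(struct toy_gmem_inode *inode,
			    const struct toy_mpol *task_pol)
{
	if (task_pol && task_pol->set) {
		inode->mpol = *task_pol;
	} else {
		inode->mpol.set = false;
		inode->mpol.node = -1;
	}
}

static void toy_mmap(struct toy_vma *vma, struct toy_gmem_inode *inode)
{
	vma->inode = inode;
}

static void toy_munmap(struct toy_vma *vma)
{
	vma->inode = NULL;
}

/* set_policy through a mapping: the policy is stored on the inode. */
static int toy_vma_set_policy(struct toy_vma *vma, struct toy_mpol pol)
{
	if (!vma->inode)
		return -1;
	vma->inode->mpol = pol;
	return 0;
}

/* Allocation path used by the VM: consults only the inode. */
static int toy_gmem_alloc_node(const struct toy_gmem_inode *inode,
			       int local_node)
{
	return inode->mpol.set ? inode->mpol.node : local_node;
}
```

In this model the VMA is just a handle for reaching the inode's policy, which
is the "means to an end" relationship described in the quoted reply.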

This makes sense, sorry for the noise.  Have been tearing apart
mempolicy lately and I'm disliking the general odor coming off
it as a whole.  I had been poking at adding mempolicy support to
filemap and you got there first.  Overall I think there are still
other problems with mempolicy, but this all looks fine as-is.

~Gregory
