Message-ID: <Z8cci0nNtwja8gyR@google.com>
Date: Tue, 4 Mar 2025 07:30:19 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Ackerley Tng <ackerleytng@...gle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, shivankg@....com, akpm@...ux-foundation.org,
willy@...radead.org, pbonzini@...hat.com, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-coco@...ts.linux.dev, chao.gao@...el.com, david@...hat.com,
bharata@....com, nikunj@....com, michael.day@....com, Neeraj.Upadhyay@....com,
thomas.lendacky@....com, michael.roth@....com, tabba@...gle.com
Subject: Re: [PATCH v6 4/5] KVM: guest_memfd: Enforce NUMA mempolicy using
shared policy
On Tue, Mar 04, 2025, Ackerley Tng wrote:
> Vlastimil Babka <vbabka@...e.cz> writes:
> >> struct shared_policy should be stored on the inode rather than the file,
> >> since the memory policy is a property of the memory (struct inode),
> >> rather than a property of how the memory is used for a given VM (struct
> >> file).
> >
> > That makes sense. AFAICS shmem also uses inodes to store policy.
> >
> >> When the shared_policy is stored on the inode, intra-host migration [1]
> >> will work correctly, since while the inode will be transferred from
> >> one VM (struct kvm) to another, the file (a VM's view/bindings of the
> >> memory) will be recreated for the new VM.
> >>
> >> I'm thinking of having a patch like this [2] to introduce inodes.
> >
> > shmem has it easier by already having inodes
> >
> >> With this, we shouldn't need to pass file pointers instead of inode
> >> pointers.
> >
> > Any downsides, besides more work needed? Or is it feasible to do it using
> > files now and convert to inodes later?
> >
> > Feels like something that must have been discussed already, but I don't
> > recall specifics.
>
> Here's where Sean described file vs inode: "The inode is effectively the
> raw underlying physical storage, while the file is the VM's view of that
> storage." [1].
>
> I guess you're right that for now there is little distinction between
> file and inode and using file should be feasible, but I feel that this
> dilutes the original intent.
Hmm, and using the file would be actively problematic at some point. One could
argue that NUMA policy is a property of the VM accessing the memory, i.e. that two
VMs mapping the same guest_memfd could want different policies. But in practice,
that would allow for conflicting requirements, e.g. different policies in each
VM for the same chunk of memory, and would likely lead to surprising behavior due
to having to manually do mbind() for every VM/file view.
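
As a rough illustration of the inode-vs-file split, here is a minimal userspace
sketch in C. It is not kernel code: toy_inode, toy_file, toy_policy and the
toy_* helpers are hypothetical names invented for this example, standing in for
the backing memory and for each VM's view of it. It only assumes that the policy
lives with the memory and that every view reaches it through a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct inode / struct file. */
enum toy_policy {
    TOY_POLICY_DEFAULT,
    TOY_POLICY_BIND_NODE0,
    TOY_POLICY_BIND_NODE1,
};

/* The "memory": exactly one policy, however many views exist. */
struct toy_inode {
    enum toy_policy policy;
};

/* One VM's view/bindings of the memory; holds no policy of its own. */
struct toy_file {
    struct toy_inode *inode;
    int vm_id;
};

/* Create a view of existing memory for the given (hypothetical) VM id. */
void toy_file_open(struct toy_file *f, struct toy_inode *inode, int vm_id)
{
    assert(f != NULL && inode != NULL);
    f->inode = inode;
    f->vm_id = vm_id;
}

/* Setting the policy through any view updates the shared memory's policy. */
void toy_set_policy(struct toy_file *f, enum toy_policy p)
{
    f->inode->policy = p;
}

/* Every view reads the same policy, so views cannot disagree. */
enum toy_policy toy_get_policy(const struct toy_file *f)
{
    return f->inode->policy;
}
```

In this sketch, a policy set through one VM's toy_file is what a second toy_file
on the same toy_inode reads back, and a toy_file recreated for a new VM (the
intra-host migration case) sees the policy without repeating the call. Were the
policy stored in toy_file instead, each view would need its own call and two
views could hold conflicting values for the same memory.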
> Something like [2] doesn't seem like too big of a change and could perhaps be
> included earlier rather than later, since it will also contribute to support
> for restricted mapping [3].
>
> [1] https://lore.kernel.org/all/ZLGiEfJZTyl7M8mS@google.com/
> [2] https://lore.kernel.org/all/d1940d466fc69472c8b6dda95df2e0522b2d8744.1726009989.git.ackerleytng@google.com/
> [3] https://lore.kernel.org/all/20250117163001.2326672-1-tabba@google.com/T/