Message-ID: <9df04c57f9d5f351bb1b4eeef764bf9ccc6711b1.camel@amazon.com>
Date: Sat, 2 Nov 2024 08:24:15 +0000
From: "Gowans, James" <jgowans@...zon.com>
To: "jgg@...pe.ca" <jgg@...pe.ca>
CC: "quic_eberman@...cinc.com" <quic_eberman@...cinc.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "rppt@...nel.org"
<rppt@...nel.org>, "brauner@...nel.org" <brauner@...nel.org>,
"anthony.yznaga@...cle.com" <anthony.yznaga@...cle.com>,
"steven.sistare@...cle.com" <steven.sistare@...cle.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Durrant,
Paul" <pdurrant@...zon.co.uk>, "Woodhouse, David" <dwmw@...zon.co.uk>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "seanjc@...gle.com"
<seanjc@...gle.com>, "linux-mm@...ck.org" <linux-mm@...ck.org>, "Saenz
Julienne, Nicolas" <nsaenz@...zon.es>, "Graf (AWS), Alexander"
<graf@...zon.de>, "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"jack@...e.cz" <jack@...e.cz>, "linux-fsdevel@...r.kernel.org"
<linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 05/10] guestmemfs: add file mmap callback
On Fri, 2024-11-01 at 10:42 -0300, Jason Gunthorpe wrote:
>
> On Fri, Nov 01, 2024 at 01:01:00PM +0000, Gowans, James wrote:
>
> > Thanks Jason, that sounds perfect. I'll work on the next rev which will:
> > - expose a filesystem which owns reserved/persistent memory, just like
> > this patch.
>
> Is this step needed?
>
> If the guest memfd is already told to get 1G pages in some normal way,
> why do we need a dedicated pool just for the KHO filesystem?
>
> Back to my suggestion, can't KHO simply freeze the guest memfd and
> then extract the memory layout, and just use the normal allocator?
>
> Or do you have a hard requirement that only KHO allocated memory can
> be preserved across kexec?
KHO can persist any memory ranges which are not MOVABLE, so provided
that guest_memfd does non-movable allocations, serialising and
persisting them should be possible.
There are other requirements here, specifically the ability to
*guarantee* GiB-level allocations, to keep the guest memory out of the
direct map for secret hiding, and to remove the struct page overhead.
Struct page overhead could be handled via HVO. But considering that the
memory must be out of the direct map anyway, it seems unnecessary to
have struct pages at all, and unnecessary to have the memory managed by
an existing allocator. The only existing 1 GiB allocator I know of is
hugetlbfs? Let me know if there's something else that could be used.
That's the main motivation for a separate pool allocated at early boot.
This is quite similar to hugetlbfs, so a natural question is whether we
could serialise and reuse hugetlbfs instead, but that would probably
open another can of worms of complexity.
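To illustrate the analogy: hugetlbfs carves out its 1 GiB pool at early
boot via kernel command-line parameters, and a persistent pool would
presumably be reserved the same way. The hugetlbfs parameters below are
real; the guestmemfs parameter name and syntax are only illustrative,
not a claim about what this series implements:

```
# hugetlbfs: reserve 16 x 1 GiB pages at early boot (existing parameters)
hugepagesz=1G hugepages=16

# hypothetical analogue: carve out a 16 GiB persistent pool at boot
guestmemfs=16G
```

In both cases the reservation happens before the buddy allocator
fragments memory, which is what makes the GiB-level allocations
guaranteed rather than best-effort.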
There's more than just the guest_memfds and their allocations to
serialise; it's probably useful to have a directory structure in the
filesystem, POSIX file ACLs, and perhaps some other filesystem
metadata. For this reason I still think that a new filesystem designed
for this use-case, one which creates guest_memfd objects when files are
opened, is the way to go.
Let me know what you think.
JG