[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <mafs0h5xmw12a.fsf@kernel.org>
Date: Mon, 01 Sep 2025 19:10:53 +0200
From: Pratyush Yadav <pratyush@...nel.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Pratyush Yadav <pratyush@...nel.org>, Pasha Tatashin
<pasha.tatashin@...een.com>, jasonmiu@...gle.com, graf@...zon.com,
changyuanl@...gle.com, rppt@...nel.org, dmatlack@...gle.com,
rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
vincent.guittot@...aro.org, hannes@...xchg.org,
dan.j.williams@...el.com, david@...hat.com, joel.granados@...nel.org,
rostedt@...dmis.org, anna.schumaker@...cle.com, song@...nel.org,
zhangguopeng@...inos.cn, linux@...ssschuh.net,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, lennart@...ttering.net, brauner@...nel.org,
linux-api@...r.kernel.org, linux-fsdevel@...r.kernel.org,
saeedm@...dia.com, ajayachandra@...dia.com, parav@...dia.com,
leonro@...dia.com, witu@...dia.com
Subject: Re: [PATCH v3 29/30] luo: allow preserving memfd
Hi Jason,
On Thu, Aug 28 2025, Jason Gunthorpe wrote:
> On Wed, Aug 27, 2025 at 05:03:55PM +0200, Pratyush Yadav wrote:
>
>> I think we need something a luo_xarray data structure that users like
>> memfd (and later hugetlb and guest_memfd and maybe others) can build to
>> make serialization easier. It will cover both contiguous arrays and
>> arrays with some holes in them.
>
> I'm not sure xarray is the right way to go, it is very complex data
> structure and building a kho variation of it seems like it is a huge
> amount of work.
>
> I'd stick with simple kvalloc type approaches until we really run into
> trouble.
>
> You can always map a sparse xarray into a kvalloc linear list by
> including the xarray index in each entry.
>
> Especially for memfd where we don't actually expect any sparsity in
> real uses cases there is no reason to invest a huge effort to optimize
> for it..
Full xarray is too complex, sure. But I think a simple sparse array with
xarray-like properties (4-byte pointers, values using xa_mk_value()) is
fairly simple to implement. More advanced features of xarray like
multi-index entries can be added later if needed.
In fact, I have a WIP version of such an array and have used it for
memfd preservation, and it looks quite alright to me. You can find the
code at [0]. It is roughly 300 lines of code. I still need to clean it
up to make it post-able, but it does work.
Building kvalloc on top of this becomes trivial.
[0] https://git.kernel.org/pub/scm/linux/kernel/git/pratyush/linux.git/commit/?h=kho-array&id=cf4c04c1e9ac854e3297018ad6dada17c54a59af
>
>> As I explained above, the versioning is already there. Beyond that, why
>> do you think a raw C struct is better than FDT? It is just another way
>> of expressing the same information. FDT is a bit more cumbersome to
>> write and read, but comes at the benefit of more introspect-ability.
>
> Doesn't have the size limitations, is easier to work list, runs
> faster.
>
>> > luo_store_object(&memfd_luo_v0, sizeof(memfd_luo_v0), <.. identifier for this fd..>, /*version=*/0);
>> > luo_store_object(&memfd_luo_v1, sizeof(memfd_luo_v1), <.. identifier for this fd..>, /*version=*/1);
>>
>> I think what you describe here is essentially how LUO works currently,
>> just that the mechanisms are a bit different.
>
> The bit different is a very important bit though :)
>
> The versioning should be first class, not hidden away as some emergent
> property of registering multiple serializers or something like that.
That makes sense. How about some simple changes to the LUO interfaces to
make the version more prominent:
int (*prepare)(struct liveupdate_file_handler *handler,
struct file *file, u64 *data, char **compatible);
This lets the subsystem fill in the compatible (AKA version) (string
here, but you can make it an integer if you want) when it serialized its
data.
And on restore side, LUO can pass in the compatible:
int (*retrieve)(struct liveupdate_file_handler *handler,
u64 data, char *compatible, struct file **file);
--
Regards,
Pratyush Yadav
Powered by blists - more mailing lists