Message-ID: <mafs0frdcyib8.fsf@kernel.org>
Date: Wed, 27 Aug 2025 16:01:47 +0200
From: Pratyush Yadav <pratyush@...nel.org>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, Pratyush Yadav <pratyush@...nel.org>,
jasonmiu@...gle.com, graf@...zon.com, changyuanl@...gle.com,
rppt@...nel.org, dmatlack@...gle.com, rientjes@...gle.com,
corbet@....net, rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com,
kanie@...ux.alibaba.com, ojeda@...nel.org, aliceryhl@...gle.com,
masahiroy@...nel.org, akpm@...ux-foundation.org, tj@...nel.org,
yoann.congal@...le.fr, mmaurer@...gle.com, roman.gushchin@...ux.dev,
chenridong@...wei.com, axboe@...nel.dk, mark.rutland@....com,
jannh@...gle.com, vincent.guittot@...aro.org, hannes@...xchg.org,
dan.j.williams@...el.com, david@...hat.com, joel.granados@...nel.org,
rostedt@...dmis.org, anna.schumaker@...cle.com, song@...nel.org,
zhangguopeng@...inos.cn, linux@...ssschuh.net,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, lennart@...ttering.net, brauner@...nel.org,
linux-api@...r.kernel.org, linux-fsdevel@...r.kernel.org,
saeedm@...dia.com, ajayachandra@...dia.com, parav@...dia.com,
leonro@...dia.com, witu@...dia.com
Subject: Re: [PATCH v3 00/30] Live Update Orchestrator
On Tue, Aug 26 2025, Pasha Tatashin wrote:
>> > The existing interface, with the addition of passing a pidfd, provides
>> > the necessary flexibility without being invasive. The change would be
>> > localized to the new code that performs the FD retrieval and wouldn't
>> > involve spoofing current or making widespread changes.
>> > For example, to handle cgroup charging for a memfd, the flow inside
>> > memfd_luo_retrieve() would look something like this:
>> >
>> > task = get_pid_task(target_pid, PIDTYPE_PID);
>> > mm = get_task_mm(task);
>> > // ...
>> > folio = kho_restore_folio(phys);
>> > // Charge to the target mm, not 'current->mm'
>> > mem_cgroup_charge(folio, mm, ...);
>> > mmput(mm);
>> > put_task_struct(task);
>> >
>> > This approach seems quite contained, and does not modify the existing
>> > interfaces. It avoids the need for the kernel to manage the entire
>> > session state and its associated security model.
Even with sessions, I don't think the kernel has to deal with the
security model. /dev/liveupdate can still be single-open only, with only
luod getting access to it. Then the kernel just hands over sessions to
luod (maybe with a new ioctl LIVEUPDATE_IOCTL_CREATE_SESSION), and luod
takes care of the security model and lifecycle. If luod crashes and
loses its handle to /dev/liveupdate, all the sessions associated with it
go away too.
Essentially, from the kernel's perspective a session would just be a
container that groups different resources together. I think this adds a
small bit of complexity on the session management and serialization
side, but it should save complexity in the participating subsystems.
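To illustrate the lifecycle described above, here is a minimal userspace
model. It is only a sketch: struct luo_dev, struct luo_session and every
function name below are hypothetical and do not come from the series.
luo_create_session() merely stands in for something like the suggested
LIVEUPDATE_IOCTL_CREATE_SESSION. It shows the single-open rule and that
releasing the handle drops every session tied to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SESSIONS 8

/* Hypothetical: a session is just a container grouping resources. */
struct luo_session {
	bool in_use;
	int nr_resources;	/* resources preserved under this session */
};

/* Hypothetical model of the single-open /dev/liveupdate handle. */
struct luo_dev {
	bool opened;
	struct luo_session sessions[MAX_SESSIONS];
};

/* Single-open: a second open fails while the first holder is alive. */
int luo_open(struct luo_dev *dev)
{
	if (dev->opened)
		return -1;
	dev->opened = true;
	return 0;
}

/* Stand-in for a create-session ioctl; returns a session id or -1. */
int luo_create_session(struct luo_dev *dev)
{
	if (!dev->opened)
		return -1;
	for (int i = 0; i < MAX_SESSIONS; i++) {
		if (!dev->sessions[i].in_use) {
			dev->sessions[i].in_use = true;
			dev->sessions[i].nr_resources = 0;
			return i;
		}
	}
	return -1;
}

/* Record one preserved resource in the given session. */
int luo_session_preserve(struct luo_dev *dev, int id)
{
	if (!dev->opened || id < 0 || id >= MAX_SESSIONS ||
	    !dev->sessions[id].in_use)
		return -1;
	dev->sessions[id].nr_resources++;
	return 0;
}

/* Handle closed (or holder crashed): all its sessions go away. */
void luo_release(struct luo_dev *dev)
{
	for (int i = 0; i < MAX_SESSIONS; i++) {
		dev->sessions[i].in_use = false;
		dev->sessions[i].nr_resources = 0;
	}
	dev->opened = false;
}

/* Count the sessions currently alive on this handle. */
int luo_nr_sessions(const struct luo_dev *dev)
{
	int n = 0;

	for (int i = 0; i < MAX_SESSIONS; i++)
		if (dev->sessions[i].in_use)
			n++;
	return n;
}
```

In this model the security policy (who may ask for a session, and what
may go into it) stays entirely with the holder of the handle, which is
the role luod would play.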
>>
>> Except it doesn't work like that in all places, iommufd for example
>> uses GFP_KERNEL_ACCOUNT which relies on current.
>
> That's a good point. For kernel allocations, I don't see a clean way
> to account for a different process.
>
> We should not be doing major allocations during the retrieval process
> itself. Ideally, the kernel would restore an FD using only the
> preserved folio data (that we can cleanly charge), and then let the
> user process perform any subsequent actions that might cause new
> kernel memory allocations. However, I can see how that might not be
> practical for all handlers.
>
> Perhaps we should add session extensions to the kernel as a follow-up
> after this series lands. We would also need to rewrite the luod design
> accordingly to move some of the session logic into the kernel.
I know KHO is not supposed to be backwards compatible yet. What is
the goal for the LUO APIs? Are they also not backwards compatible? If
not, I think we should also consider how sessions will play into
backwards compatibility. For example, once we add sessions, what happens
to the older versions of luod that directly call preserve or unpreserve?
--
Regards,
Pratyush Yadav