[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+CK2bB9r_pMzd0VbLsAGTwh8kvV_o3rFM_W--drutewomr1ZQ@mail.gmail.com>
Date: Tue, 26 Aug 2025 17:03:59 +0000
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Pratyush Yadav <pratyush@...nel.org>, jasonmiu@...gle.com, graf@...zon.com,
changyuanl@...gle.com, rppt@...nel.org, dmatlack@...gle.com,
rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com, ojeda@...nel.org,
aliceryhl@...gle.com, masahiroy@...nel.org, akpm@...ux-foundation.org,
tj@...nel.org, yoann.congal@...le.fr, mmaurer@...gle.com,
roman.gushchin@...ux.dev, chenridong@...wei.com, axboe@...nel.dk,
mark.rutland@....com, jannh@...gle.com, vincent.guittot@...aro.org,
hannes@...xchg.org, dan.j.williams@...el.com, david@...hat.com,
joel.granados@...nel.org, rostedt@...dmis.org, anna.schumaker@...cle.com,
song@...nel.org, zhangguopeng@...inos.cn, linux@...ssschuh.net,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, linux-mm@...ck.org,
gregkh@...uxfoundation.org, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
rafael@...nel.org, dakr@...nel.org, bartosz.golaszewski@...aro.org,
cw00.choi@...sung.com, myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, lennart@...ttering.net, brauner@...nel.org,
linux-api@...r.kernel.org, linux-fsdevel@...r.kernel.org, saeedm@...dia.com,
ajayachandra@...dia.com, parav@...dia.com, leonro@...dia.com, witu@...dia.com
Subject: Re: [PATCH v3 00/30] Live Update Orchestrator
> > The existing interface, with the addition of passing a pidfd, provides
> > the necessary flexibility without being invasive. The change would be
> > localized to the new code that performs the FD retrieval and wouldn't
> > involve spoofing current or making widespread changes.
> > For example, to handle cgroup charging for a memfd, the flow inside
> > memfd_luo_retrieve() would look something like this:
> >
> > task = get_pid_task(target_pid, PIDTYPE_PID);
> > mm = get_task_mm(task);
> > // ...
> > folio = kho_restore_folio(phys);
> > // Charge to the target mm, not 'current->mm'
> > mem_cgroup_charge(folio, mm, ...);
> > mmput(mm);
> > put_task_struct(task);
>
> Execpt it doesn't work like that in all places, iommufd for example
> uses GFP_KERNEL_ACCOUNT which relies on current.
That's a good point. For kernel allocations, I don't see a clean way
to account for a different process.
We should not be doing major allocations during the retrieval process
itself. Ideally, the kernel would restore an FD using only the
preserved folio data (that we can cleanly charge), and then let the
user process perform any subsequent actions that might cause new
kernel memory allocations. However, I can see how that might not be
practical for all handlers.
Perhaps, we should add session extensions to the kernel as follow-up
after this series lands, we would also need to rewrite luod design
accordingly to move some of the sessions logic into the kernel.
Thank you,
Pasha
Powered by blists - more mailing lists