Message-ID: <mafs0ikkqv3ds.fsf@kernel.org>
Date: Fri, 20 Jun 2025 17:28:31 +0200
From: Pratyush Yadav <pratyush@...nel.org>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Mike Rapoport <rppt@...nel.org>,  Pratyush Yadav <pratyush@...nel.org>,
  Jason Gunthorpe <jgg@...pe.ca>,  jasonmiu@...gle.com,  graf@...zon.com,
  changyuanl@...gle.com,  dmatlack@...gle.com,  rientjes@...gle.com,
  corbet@....net,  rdunlap@...radead.org,  ilpo.jarvinen@...ux.intel.com,
  kanie@...ux.alibaba.com,  ojeda@...nel.org,  aliceryhl@...gle.com,
  masahiroy@...nel.org,  akpm@...ux-foundation.org,  tj@...nel.org,
  yoann.congal@...le.fr,  mmaurer@...gle.com,  roman.gushchin@...ux.dev,
  chenridong@...wei.com,  axboe@...nel.dk,  mark.rutland@....com,
  jannh@...gle.com,  vincent.guittot@...aro.org,  hannes@...xchg.org,
  dan.j.williams@...el.com,  david@...hat.com,  joel.granados@...nel.org,
  rostedt@...dmis.org,  anna.schumaker@...cle.com,  song@...nel.org,
  zhangguopeng@...inos.cn,  linux@...ssschuh.net,
  linux-kernel@...r.kernel.org,  linux-doc@...r.kernel.org,
  linux-mm@...ck.org,  gregkh@...uxfoundation.org,  tglx@...utronix.de,
  mingo@...hat.com,  bp@...en8.de,  dave.hansen@...ux.intel.com,
  x86@...nel.org,  hpa@...or.com,  rafael@...nel.org,  dakr@...nel.org,
  bartosz.golaszewski@...aro.org,  cw00.choi@...sung.com,
  myungjoo.ham@...sung.com,  yesanishhere@...il.com,
  Jonathan.Cameron@...wei.com,  quic_zijuhu@...cinc.com,
  aleksander.lobakin@...el.com,  ira.weiny@...el.com,
  andriy.shevchenko@...ux.intel.com,  leon@...nel.org,  lukas@...ner.de,
  bhelgaas@...gle.com,  wagi@...nel.org,  djeffery@...hat.com,
  stuart.w.hayes@...il.com
Subject: Re: [RFC v2 05/16] luo: luo_core: integrate with KHO

Hi Pasha,

On Thu, Jun 19 2025, Pasha Tatashin wrote:

[...]
>> And it has to be done before kexec load, at least until we resolve this.
>
> The before-kexec-load constraint has been fixed. The only
> "finalization" constraint we have is that it should happen before
> reboot(LINUX_REBOOT_CMD_KEXEC), and only because memory allocations
> during kernel shutdown are undesirable. Once KHO moves away from a
> monolithic state machine, this constraint disappears. Kernel
> components could preserve their resources at appropriate times, not
> necessarily tied to shutdown time. For live update scenarios, LUO
> already orchestrates this timing.
>
>> Currently this is triggered either by KHO debugfs or by LUO ioctls. If we
>> completely drop KHO debugfs and notifiers, we still need something that
>> would trigger the magic.
>
> An external "magic trigger" for KHO (like the current finalize
> notifier or debugfs command) is necessary for scenarios like live
> update, where userspace resources are being preserved in a coordinated
> fashion just before kexec.
>
> For kernel-internal resources that are unrelated to such a
> userspace-driven live update flow, the respective kernel components
> should directly use KHO's primitive preservation APIs
> (kho_preserve_folio, etc.) when they need to mark their resources for
> handover. No separate state machine or external trigger should be
> required for these individual, self-contained preservation acts.

For kernel-internal components, I think this makes a lot of sense,
especially now that we don't need to get everything done by kexec load
time. I suppose the liveupdate_reboot() call at reboot time to do the
final preparations can be useful, but subsystems could just as well
register reboot notifiers to get the same notification.
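
Roughly what I have in mind for such a component (untested sketch; the
foo_* bits are made up, and I'm assuming kho_preserve_folio() keeps
more or less its current signature):

#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/kexec_handover.h>
#include <linux/printk.h>
#include <linux/reboot.h>

static struct folio *foo_state_folio;

/* Called whenever foo decides its state is final. Here that happens to
 * be a reboot notifier, but nothing forces that; it could just as well
 * be driven by foo's own logic at any earlier point.
 */
static int foo_preserve(struct notifier_block *nb, unsigned long action,
                        void *data)
{
        if (kho_preserve_folio(foo_state_folio))
                pr_warn("foo: failed to preserve state for handover\n");
        return NOTIFY_OK;
}

static struct notifier_block foo_reboot_nb = {
        .notifier_call = foo_preserve,
};

static int __init foo_init(void)
{
        foo_state_folio = folio_alloc(GFP_KERNEL, 0);
        if (!foo_state_folio)
                return -ENOMEM;
        return register_reboot_notifier(&foo_reboot_nb);
}

No KHO state machine or external finalize trigger is involved there;
the component alone decides when its resources get marked for handover.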

>
>> I'm not saying we should keep KHO debugfs and notifiers, I'm saying that if
>> we make LUO the only thing driving KHO, liveupdate is not an appropriate
>> name.
>
> LUO drives KHO specifically for the purpose of live updates. If a
> different userspace use case emerges that serves another distinct
> purpose (e.g., something other than preserving an FD or a device
> across a kernel reboot, i.e. something for which LUO does not provide
> uAPI), then that would probably need a uAPI separate from LUO instead
> of extending the LUO uAPI.

Outside of hypervisor live update, I have a very clear use case in mind:
userspace memory handover (on the guest side). Say a guest running an
in-memory cache like memcached with many gigabytes of cache wants to
reboot. It can just shove the cache into a memfd, give it to LUO, and
restore it after reboot. Some services that suffer from long reboots are
looking into using this to reduce downtime. Since it pretty much
overlaps with the hypervisor work for now, I haven't been talking about
it as much.
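
Concretely, something along these lines (purely illustrative; the
/dev/liveupdate path, the LUO_PRESERVE_FD number and the little request
struct below are placeholders I made up, not the actual LUO uAPI):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define LUO_PRESERVE_FD 0       /* placeholder, not a real ioctl number */

int main(void)
{
        /* The cache payload lives in a memfd instead of anonymous memory. */
        int memfd = memfd_create("cache", 0);
        const char payload[] = "many gigabytes of cache, in spirit";

        if (memfd < 0 || write(memfd, payload, sizeof(payload)) < 0)
                return 1;

        /* Hand the memfd to LUO so its pages survive the kexec reboot. */
        int luo = open("/dev/liveupdate", O_RDWR);
        struct { int fd; unsigned long token; } req = {
                .fd = memfd,
                .token = 42,    /* how we find the fd again after reboot */
        };
        if (luo < 0 || ioctl(luo, LUO_PRESERVE_FD, &req) < 0)
                perror("preserve");

        /* After reboot, the service would ask LUO for the fd back by
         * token and mmap() the same pages, skipping cache warm-up.
         */
        return 0;
}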

Would you also call this use case "live update"? Does it also fit with
your vision of where LUO should go?

If not, why do you think we should have a parallel set of uAPIs that do
similar work? Why can't we accommodate other use cases under one API,
especially as long as they don't have conflicting goals? In practice,
outside of s/luo/khoctl/g, I don't think much would change as of now.
The state machine and APIs would stay the same.

When those use cases start to diverge from live update, or conflict
with it, we can then decide to have a separate interface for them. But
unlike going the other way round, we won't end up with a somewhat
confusing name for a more widely applicable technology.

I've been thinking about the naming since the start, but I didn't want
to bikeshed on it too much. But if we are also talking about the scope
of LUO, then I think this is a conversation worth having.

PS: I don't have real data, but I have a feeling that after luo/khoctl
    mature, more use cases will come out of the woodwork to optimize
    reboots.

-- 
Regards,
Pratyush Yadav
