Message-ID: <mafs0sejmse57.fsf@kernel.org>
Date: Thu, 26 Jun 2025 17:42:28 +0200
From: Pratyush Yadav <pratyush@...nel.org>
To: David Matlack <dmatlack@...gle.com>
Cc: Christian Brauner <brauner@...nel.org>,  Pasha Tatashin
 <pasha.tatashin@...een.com>,  pratyush@...nel.org,  jasonmiu@...gle.com,
  graf@...zon.com,  changyuanl@...gle.com,  rppt@...nel.org,
  rientjes@...gle.com,  corbet@....net,  rdunlap@...radead.org,
  ilpo.jarvinen@...ux.intel.com,  kanie@...ux.alibaba.com,
  ojeda@...nel.org,  aliceryhl@...gle.com,  masahiroy@...nel.org,
  akpm@...ux-foundation.org,  tj@...nel.org,  yoann.congal@...le.fr,
  mmaurer@...gle.com,  roman.gushchin@...ux.dev,  chenridong@...wei.com,
  axboe@...nel.dk,  mark.rutland@....com,  jannh@...gle.com,
  vincent.guittot@...aro.org,  hannes@...xchg.org,
  dan.j.williams@...el.com,  david@...hat.com,  joel.granados@...nel.org,
  rostedt@...dmis.org,  anna.schumaker@...cle.com,  song@...nel.org,
  zhangguopeng@...inos.cn,  linux@...ssschuh.net,
  linux-kernel@...r.kernel.org,  linux-doc@...r.kernel.org,
  linux-mm@...ck.org,  gregkh@...uxfoundation.org,  tglx@...utronix.de,
  mingo@...hat.com,  bp@...en8.de,  dave.hansen@...ux.intel.com,
  x86@...nel.org,  hpa@...or.com,  rafael@...nel.org,  dakr@...nel.org,
  bartosz.golaszewski@...aro.org,  cw00.choi@...sung.com,
  myungjoo.ham@...sung.com,  yesanishhere@...il.com,
  Jonathan.Cameron@...wei.com,  quic_zijuhu@...cinc.com,
  aleksander.lobakin@...el.com,  ira.weiny@...el.com,
  andriy.shevchenko@...ux.intel.com,  leon@...nel.org,  lukas@...ner.de,
  bhelgaas@...gle.com,  wagi@...nel.org,  djeffery@...hat.com,
  stuart.w.hayes@...il.com
Subject: Re: [RFC v2 10/16] luo: luo_ioctl: add ioctl interface

On Wed, Jun 25 2025, David Matlack wrote:

> On Wed, Jun 25, 2025 at 2:36 AM Christian Brauner <brauner@...nel.org> wrote:
>> >
>> > While I agree that a filesystem offers superior introspection and
>> > integration with standard tools, building this complex, stateful
>> > orchestration logic on top of VFS seemed to be forcing a square peg
>> > into a round hole. The ioctl interface, while more opaque, provides a
>> > direct and explicit way to command the state machine and manage these
>> > complex lifecycle and dependency rules.
>>
>> I'm not going to argue that you have to switch to this kexecfs idea
>> but...
>>
>> You're using a character device that's tied to devtmpfs. In other words,
>> you're already using a filesystem interface. Literally the whole code
>> here is built on top of filesystem APIs. So this argument is just very
>> wrong imho. If you can build it on top of a character device using VFS
>> interfaces you can do it as a minimal filesystem.
>>
>> You're free to define the filesystem interface any way you like it. We
>> have a ton of examples there. All your ioctls would just be tied to the
>> filesystem instance instead of the /dev/somethingsomething character
>> device. The state machine could just be implemented the same way.
>>
>> One of my points is that with an fs interface you can have easy state
>> serialization on a per-service level. IOW, you have a bunch of virtual
>> machines running as services or some networking services or whatever.
>> You could just bind-mount an instance of kexecfs into the service and
>> the service can persist state into the instance and easily recover it
>> after kexec.
>
> This approach sounds worth exploring more. It would avoid the need for
> a centralized daemon to mediate the preservation and restoration of
> all file descriptors.

One of the jobs of the centralized daemon is to decide the _policy_ of
who gets to preserve things and more importantly, make sure the right
party unpreserves the right FDs after a kexec. I don't see how this
interface fixes this problem. You would still need a way to identify
which kexecfs instance belongs to whom and enforce that. The kernel
probably shouldn't be the one doing this kind of policy so you still
need some userspace component to make those decisions.
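
To make the policy point concrete, here is a minimal sketch of the
bookkeeping such a userspace component might keep. Everything in it is
hypothetical: the PolicyTable class, the instance names and the service
names are placeholders for illustration, not part of the RFC, and the
sketch does not address how the table itself would survive a kexec.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyTable:
    """Hypothetical userspace record of which service owns which instance."""

    # Maps an instance name to the name of the service that owns it.
    owners: dict = field(default_factory=dict)

    def register(self, instance: str, service: str) -> None:
        """Record ownership before kexec; refuse a second claim on an instance."""
        if instance in self.owners:
            raise ValueError(f"{instance} is already owned by {self.owners[instance]}")
        self.owners[instance] = service

    def may_restore(self, instance: str, service: str) -> bool:
        """After kexec, allow restore only for the service that registered."""
        return self.owners.get(instance) == service
```

Whether this check sits in a daemon in front of /dev/liveupdate or in
front of a set of kexecfs mounts, something still has to own the table
and enforce it.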

>
> I'm not sure that we can get rid of the machine-wide state machine
> though, as there is some kernel state that will necessarily cross
> these kexecfs domains (e.g. IOMMU driver state). So we still might
> need /dev/liveupdate for that.

Generally speaking, I think both VFS-based and IOCTL-based interfaces
are more or less equally expressive/powerful. Most of the ioctl
operations can be translated to a VFS operation and vice versa.

For example, the fsopen() call is similar to open("/dev/liveupdate") --
both would create a live update session that closes automatically when the
FD is closed or the FS is unmounted. Similarly, each ioctl can be replaced with a
file in the FS. For example, LIVEUPDATE_IOCTL_FD_PRESERVE can be
replaced with a fd_preserve file where you write() the FD number.
LIVEUPDATE_IOCTL_GET_STATE or LIVEUPDATE_IOCTL_PREPARE, etc. can be
replaced by a "state" file where you can read() or write() the state.
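
As a rough illustration of this mapping, the sketch below drives the
file-based variant with nothing but ordinary file I/O. It runs against a
temporary directory that stands in for a mounted instance, so no kernel
support is involved. The file names fd_preserve and state come from the
description above; the helper names and the state strings "normal" and
"prepared" are placeholders chosen for the example.

```python
import os
import tempfile


def set_state(instance_dir: str, state: str) -> None:
    """Write the requested state, as a write() to the "state" file would."""
    with open(os.path.join(instance_dir, "state"), "w") as f:
        f.write(state + "\n")


def get_state(instance_dir: str) -> str:
    """Read the current state, the file-based stand-in for a GET_STATE ioctl."""
    with open(os.path.join(instance_dir, "state")) as f:
        return f.read().strip()


def preserve_fd(instance_dir: str, fd: int) -> None:
    """Ask for an FD to be preserved by writing its number to fd_preserve."""
    with open(os.path.join(instance_dir, "fd_preserve"), "w") as f:
        f.write(f"{fd}\n")


def make_mock_instance() -> str:
    """Create a temporary directory that mimics a mounted instance.

    In a real filesystem the kernel would act on these writes; here the
    files are plain files, so only the calling pattern is demonstrated.
    """
    instance_dir = tempfile.mkdtemp(prefix="mock-kexecfs-")
    set_state(instance_dir, "normal")  # placeholder initial state
    return instance_dir
```

A caller would use make_mock_instance() to get a directory, then call
preserve_fd() and set_state() on it; the same calls would work unchanged
on a real mount point if such a filesystem existed.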

I think the main benefit of the VFS-based interface is ease of use.
There already exist a bunch of utilities and libraries that we can use to
interact with files. When we have ioctls, we would need to write
everything ourselves. For example, instead of
LIVEUPDATE_IOCTL_GET_STATE, you can do "cat state", which is a bit
easier to do.

As for downsides, I think we might end up with a bit more boilerplate
code, but beyond that I am not sure.

-- 
Regards,
Pratyush Yadav
