Message-ID: <aGqJIFs8GpvHn_Yy@kernel.org>
Date: Sun, 6 Jul 2025 17:33:04 +0300
From: Mike Rapoport <rppt@...nel.org>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: David Matlack <dmatlack@...gle.com>,
	Christian Brauner <brauner@...nel.org>,
	Pasha Tatashin <pasha.tatashin@...een.com>, jasonmiu@...gle.com,
	graf@...zon.com, changyuanl@...gle.com, rientjes@...gle.com,
	corbet@....net, rdunlap@...radead.org,
	ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
	ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
	akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
	mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
	axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
	vincent.guittot@...aro.org, hannes@...xchg.org,
	dan.j.williams@...el.com, david@...hat.com,
	joel.granados@...nel.org, rostedt@...dmis.org,
	anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn,
	linux@...ssschuh.net, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-mm@...ck.org,
	gregkh@...uxfoundation.org, tglx@...utronix.de, mingo@...hat.com,
	bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
	hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
	bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
	myungjoo.ham@...sung.com, yesanishhere@...il.com,
	Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
	aleksander.lobakin@...el.com, ira.weiny@...el.com,
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
	stuart.w.hayes@...il.com
Subject: Re: [RFC v2 10/16] luo: luo_ioctl: add ioctl interface

On Thu, Jun 26, 2025 at 05:42:28PM +0200, Pratyush Yadav wrote:
> On Wed, Jun 25 2025, David Matlack wrote:
> 
> > On Wed, Jun 25, 2025 at 2:36 AM Christian Brauner <brauner@...nel.org> wrote:
> >> >
> >> > While I agree that a filesystem offers superior introspection and
> >> > integration with standard tools, building this complex, stateful
> >> > orchestration logic on top of VFS seemed to be forcing a square peg
> >> > into a round hole. The ioctl interface, while more opaque, provides a
> >> > direct and explicit way to command the state machine and manage these
> >> > complex lifecycle and dependency rules.
> >>
> >> I'm not going to argue that you have to switch to this kexecfs idea
> >> but...
> >>
> >> You're using a character device that's tied to devtmpfs. In other words,
> >> you're already using a filesystem interface. Literally the whole code
> >> here is built on top of filesystem APIs. So this argument is just very
> >> wrong imho. If you can build it on top of a character device using VFS
> >> interfaces you can do it as a minimal filesystem.
> >>
> >> You're free to define the filesystem interface any way you like it. We
> >> have a ton of examples there. All your ioctls would just be tied to the
> >> filesystem instance instead of the /dev/somethingsomething character
> >> device. The state machine could just be implemented the same way.
> >>
> >> One of my points is that with an fs interface you can have easy state
> >> serialization on a per-service level. IOW, you have a bunch of virtual
> >> machines running as services or some networking services or whatever.
> >> You could just bind-mount an instance of kexecfs into the service and
> >> the service can persist state into the instance and easily recover it
> >> after kexec.
> >
> > This approach sounds worth exploring more. It would avoid the need for
> > a centralized daemon to mediate the preservation and restoration of
> > all file descriptors.
> 
> One of the jobs of the centralized daemon is to decide the _policy_ of
> who gets to preserve things and more importantly, make sure the right
> party unpreserves the right FDs after a kexec. I don't see how this
> interface fixes this problem. You would still need a way to identify
> which kexecfs instance belongs to who and enforce that. The kernel
> probably shouldn't be the one doing this kind of policy so you still
> need some userspace component to make those decisions.
> 
> >
> > I'm not sure that we can get rid of the machine-wide state machine
> > though, as there is some kernel state that will necessarily cross
> > these kexecfs domains (e.g. IOMMU driver state). So we still might
> > need /dev/liveupdate for that.
> 
> Generally speaking, I think both VFS-based and IOCTL-based interfaces
> are more or less equally expressive/powerful. Most of the ioctl
> operations can be translated to a VFS operation and vice versa.
> 
> For example, the fsopen() call is similar to open("/dev/liveupdate") --
> both would create a live update session which auto closes when the FD is
> closed or FS unmounted. Similarly, each ioctl can be replaced with a
> file in the FS. For example, LIVEUPDATE_IOCTL_FD_PRESERVE can be
> replaced with a fd_preserve file where you write() the FD number.
> LIVEUPDATE_IOCTL_GET_STATE or LIVEUPDATE_IOCTL_PREPARE, etc. can be
> replaced by a "state" file where you can read() or write() the state.
> 
> I think the main benefit of the VFS-based interface is ease of use.
> There already exist a bunch of utilities and libraries that we can use to
> interact with files. When we have ioctls, we would need to write
> everything ourselves. For example, instead of
> LIVEUPDATE_IOCTL_GET_STATE, you can do "cat state", which is a bit
> easier to do.
>
> As for downsides, I think we might end up with a bit more boilerplate
> code, but beyond that I am not sure.

One of the points in Christian's suggestion was that the ioctl doesn't have
to be bound to a misc device. Even if we don't use read()/write()/link() etc,
we can have a filesystem that exposes, say, a "control" file, and that file
has the same liveupdate_ioctl() in its fops as we have now in the miscdev.

The cost is indeed a bit of boilerplate code to create the filesystem, but
it would be easier to extend for per-service and container support.

And we won't need a sysfs entry for status, as it can also be pre-populated
in kexecfs (or whatever it'll be called).
 
> -- 
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.
