Message-ID: <20250307151417.GQ354511@nvidia.com>
Date: Fri, 7 Mar 2025 11:14:17 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Christian Brauner <brauner@...nel.org>
Cc: Pratyush Yadav <ptyadav@...zon.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Jonathan Corbet <corbet@....net>,
Eric Biederman <ebiederm@...ssion.com>,
Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.cz>,
Hugh Dickins <hughd@...gle.com>, Alexander Graf <graf@...zon.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
David Woodhouse <dwmw2@...radead.org>,
James Gowans <jgowans@...zon.com>, Mike Rapoport <rppt@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Pasha Tatashin <tatashin@...gle.com>,
Anthony Yznaga <anthony.yznaga@...cle.com>,
Dave Hansen <dave.hansen@...el.com>,
David Hildenbrand <david@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Wei Yang <richard.weiyang@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, kexec@...ts.infradead.org
Subject: Re: [RFC PATCH 1/5] misc: introduce FDBox

On Fri, Mar 07, 2025 at 10:31:39AM +0100, Christian Brauner wrote:
> On Fri, Mar 07, 2025 at 12:57:35AM +0000, Pratyush Yadav wrote:
> > The File Descriptor Box (FDBox) is a mechanism for userspace to name
> > file descriptors and give them over to the kernel to hold. They can
> > later be retrieved by passing in the same name.
> >
> > The primary purpose of FDBox is to be used with Kexec Handover (KHO).
> > There are many kinds of anonymous file descriptors in the kernel like
> > memfd, guest_memfd, iommufd, etc. that would be useful to be preserved
> > using KHO. To be able to do that, there needs to be a mechanism to label
> > FDs that allows userspace to set the label before doing KHO and to use
> > the label to map them back after KHO. FDBox achieves that purpose by
> > exposing a miscdevice which exposes ioctls to label and transfer FDs
> > between the kernel and userspace. FDBox is not intended to work with any
> > generic file descriptor. Support for each kind of FD must be explicitly
> > enabled.
>
> This makes no sense as a generic concept. If you want to restore shmem
> and possibly anonymous inodes files via KHO then tailor the solution to
> shmem and anon inodes but don't make this generic infrastructure. This
> has zero chances to cover generic files.

We need it to cover a range of FD types in the kernel, like iommufd and
vfio.

It is not "generic" in the sense that every FD in the kernel magically
works with fdbox, but in the sense that any driver/subsystem providing an
FD could be enlightened to support it.

We very much do not want the infrastructure tied to just shmem and memfd.

> As soon as you're dealing with non-kernel internal mounts that are not
> guaranteed to always be there or something that depends on superblock or
> mount specific information that can change you're already screwed.

This is really targeted at anonymous or character device file
descriptors that don't have issues with mounts.

The same remark applies to inode permissions and whatnot. The successor
kernel would be responsible for securing the FDBox, and when it takes
anything out it has to relabel it if required.

Inode numbers and the like can change because this is not something like
CRIU that would have state linked to inode numbers. The applications in
the successor kernels are already very special; they will need to cope
with inode number changes along with all the other special stuff they do.

> And struct file should have zero to do with this KHO stuff. It doesn't
> need to carry new operations and it doesn't need to waste precious space
> for any of this.

Yeah, it should go through file_operations in some way.

Jason
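
As a rough illustration of the opt-in model discussed above, here is a
userspace toy in C. It is only a sketch under stated assumptions: every
name in it (toy_file_ops, preserve, toy_box_put, toy_box_get) is made up
for the example and is neither the FDBox patch's API nor the kernel's
struct file_operations. It shows a named box that accepts a file only
when that file type's ops table supplies an optional hook, so nothing is
added to the per-file structure itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BOX_MAX  8
#define TOY_NAME_LEN 32

/* Hypothetical stand-in for an ops table; not the kernel's file_operations. */
struct toy_file_ops {
	const char *type_name;
	/* Optional hook. NULL means this file type has not opted in. */
	int (*preserve)(int fd);
};

/* The per-file object carries no new state, only a pointer to its ops. */
struct toy_file {
	int fd;
	const struct toy_file_ops *ops;
};

struct toy_box {
	char names[TOY_BOX_MAX][TOY_NAME_LEN];
	const struct toy_file *files[TOY_BOX_MAX];
	size_t count;
};

/* Return the slot index holding 'name', or -1 if it is not in the box. */
static int toy_box_find(const struct toy_box *box, const char *name)
{
	for (size_t i = 0; i < box->count; i++)
		if (strcmp(box->names[i], name) == 0)
			return (int)i;
	return -1;
}

/* Store 'f' under 'name'. Returns 0 on success, -1 on any refusal. */
int toy_box_put(struct toy_box *box, const char *name, const struct toy_file *f)
{
	assert(box && name && f);
	if (!f->ops || !f->ops->preserve)
		return -1;	/* file type never opted in */
	if (box->count == TOY_BOX_MAX || strlen(name) >= TOY_NAME_LEN)
		return -1;	/* box full or name too long */
	if (toy_box_find(box, name) >= 0)
		return -1;	/* name already in use */
	if (f->ops->preserve(f->fd) != 0)
		return -1;	/* the type-specific hook refused */
	strcpy(box->names[box->count], name);
	box->files[box->count] = f;
	box->count++;
	return 0;
}

/* Take the file stored under 'name' out of the box, or return NULL. */
const struct toy_file *toy_box_get(struct toy_box *box, const char *name)
{
	assert(box && name);
	int i = toy_box_find(box, name);
	if (i < 0)
		return NULL;
	const struct toy_file *f = box->files[i];
	size_t last = box->count - 1;
	if ((size_t)i != last) {
		/* Fill the hole with the last entry. */
		strcpy(box->names[i], box->names[last]);
		box->files[i] = box->files[last];
	}
	box->count = last;
	return f;
}

/* An "enlightened" type provides the hook; an ordinary one leaves it NULL. */
static int toy_memfd_preserve(int fd) { (void)fd; return 0; }

const struct toy_file_ops toy_memfd_ops = { "memfd", toy_memfd_preserve };
const struct toy_file_ops toy_plain_ops = { "regular file", NULL };
```

With these definitions, putting a file whose ops point at toy_memfd_ops
succeeds, while one using toy_plain_ops is refused, and a name can be
taken out of the box only once.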