[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251020115734.GH316284@nvidia.com>
Date: Mon, 20 Oct 2025 08:57:34 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Tzung-Bi Shih <tzungbi@...nel.org>
Cc: Benson Leung <bleung@...omium.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>,
Jonathan Corbet <corbet@....net>, Shuah Khan <shuah@...nel.org>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
chrome-platform@...ts.linux.dev, linux-kselftest@...r.kernel.org,
Laurent Pinchart <laurent.pinchart@...asonboard.com>,
Bartosz Golaszewski <brgl@...ev.pl>,
Wolfram Sang <wsa+renesas@...g-engineering.com>,
Simona Vetter <simona.vetter@...ll.ch>,
Dan Williams <dan.j.williams@...el.com>
Subject: Re: [PATCH v5 5/7] revocable: Add fops replacement
On Sun, Oct 19, 2025 at 11:08:29PM +0800, Tzung-Bi Shih wrote:
> On Fri, Oct 17, 2025 at 01:21:16PM -0300, Jason Gunthorpe wrote:
> > On Sat, Oct 18, 2025 at 12:07:58AM +0800, Tzung-Bi Shih wrote:
> > > > This is already properly lifetime controlled!
> > > >
> > > > It *HAS* to be, and even your patches are assuming it by blindly
> > > > reaching into the parent's memory!
> > > >
> > > > + misc->rps[0] = ec->ec_dev->revocable_provider;
> > > >
> > > > If the parent driver has been racily unbound at this point the
> > > > ec->ec_dev is already a UAF!
> > >
> > > Not really, it uses the fact that the caller is from probe(). I think the
> > > driver can't be unbound when it is still in probe().
> >
> > Right, but that's my point you are already relying on driver binding
> > lifetime rules to make your access valid. You should continue to rely
> > on that and fix the lack of synchronous remove to fix the bug.
>
> I think what you're looking for is something similar to the following
> patches.
>
> - Instead of having a real resource to protect with revocable, use the
> subsystem device itself as a virtual resource. Revoke the virtual
> resource when unregistering the device from the subsystem.
>
> - Exit earlier if the virtual resource is NULL (i.e. the subsystem device
> has been unregistered) in the file operation wrappers.
Sure
> By doing so, we don't need to provide a misc_deregister_sync() which could
> probably maintain a list of opening files in miscdevice and handle with all
> opening files when unregistering.
I don't think we want to change the default behavior of
misc_deregister.. Maybe if it was a mutex not srcu it would be OK, but
srcu you are looking at delaying driver removal by seconds
potentially.
> @@ -234,6 +240,10 @@ int misc_register(struct miscdevice *misc)
> return -EINVAL;
> }
>
> + misc->rp = revocable_provider_alloc(misc);
> + if (!misc->rp)
> + return -ENOMEM;
Just get rid of all this revocable stuff, all this needs is a scru or
a mutex, none of this obfuscation around a simple lock is helpful in
core kernel code.
> @@ -1066,6 +1066,7 @@ struct file {
> freeptr_t f_freeptr;
> };
> /* --- cacheline 3 boundary (192 bytes) --- */
> + struct fs_revocable_replacement *f_rr;
> } __randomize_layout
The thing that will likely attract objections is this. It is probably
a good idea to try to remove it.
For simple misc users the inode->i_cdev will always be valid and you
can reach the struct misc_dev/cdev from there in all the calls.
More complex cdev users replace the inode so that wouldn't work
universally but it is good enough to get started at least.
Jason
Powered by blists - more mailing lists