[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <697bfd05bd47f_1d6f10078@dwillia2-mobl4.notmuch>
Date: Thu, 29 Jan 2026 16:36:21 -0800
From: <dan.j.williams@...el.com>
To: Danilo Krummrich <dakr@...nel.org>, <dan.j.williams@...el.com>
CC: Jason Gunthorpe <jgg@...dia.com>, Laurent Pinchart
<laurent.pinchart@...asonboard.com>, Bartosz Golaszewski
<bartosz.golaszewski@....qualcomm.com>, Johan Hovold <johan@...nel.org>,
"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>, "Rafael J . Wysocki"
<rafael@...nel.org>, Tzung-Bi Shih <tzungbi@...nel.org>, Linus Walleij
<linusw@...nel.org>, Jonathan Corbet <corbet@....net>, Shuah Khan
<shuah@...nel.org>, Wolfram Sang <wsa+renesas@...g-engineering.com>, "Simona
Vetter" <simona.vetter@...ll.ch>, <linux-doc@...r.kernel.org>,
<linux-kselftest@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Bartosz
Golaszewski" <brgl@...nel.org>
Subject: Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
Danilo Krummrich wrote:
> On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote:
> > Jason Gunthorpe wrote:
> >> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> >> > > The latter already have robust schemes to help the driver shutdown and
> >> > > end the concurrent operations. ie cancel_work_sync(),
> >> > > del_timer_sync(), free_irq(), and *notifier_unregister().
> >> >
> >> > One a side note, devm_request_irq() is another of the devm_* helpers
> >> > that cause race conditions, as interrupt handlers can run right after
> >> > .remove() returns, which drivers will most likely not handle correctly.
> >>
> >> Yes! You *cannot* intermix devm and non-devm approaches without
> >> creating very subtle bugs exactly like this. If your subsystem does
> >> not provide a "devm register" helper its drivers shouldn't use devm.
> >
> > I wonder if we should have a proactive debug mode that checks for
> > idiomatic devres usage and flags:
> >
> > - registering devres actions while the driver is detached
>
> In Rust we already enforce this through the type system, i.e. we even fail to
> compile the code when this is violated. (I.e. for all scopes that guarantee that
> a device is bound (and will remain throughout) we give out a &Device<Bound>,
> which serves as a cookie.)
>
> In C I don't really see how that would be possible without additional locking,
> as the only thing we could check is dev->driver, which obviously is racy.
>
> > - registering devres actions for a device with a driver that has a
> > .remove() method
>
> This is perfectly valid, drivers might still be performing teardown operations
> on the device (if it did not get hot unplugged).
Yes, this one is a soft warning. It is perfectly valid, but the first
thing I look for in a device that uses devm in ->probe() and also has a
->remove() method is dependencies of the devm object on the ->remove()
managed object. That case is potentially mitigated if all resources are
devm acquired and no ->remove() is needed.
> > - passing a devres allocation to a kobject API
>
> Well, this is an ownership violation. Technically, devres owns this allocation
> and devres releases the allocation when the device is unbound. Which makes this
> allocation only ever valid to access from a scope that is guaranteed that the
> device is (and remains) bound.
To be clear I am talking about:
dev2 = devm_kzalloc(dev1...)
init(dev2)
device_register(dev2)
...where it must be valid past unbind of dev1.
> > - invoking devres release actions from a kobject release API
>
> This is similar to "registering devres actions while the driver is detached" and
> falls into the boarder problem class of "messing with devres objects outside of
> a Device<Bound> scope".
>
> Again, I think in C we can't properly protect against this. While we could also
> give out a "Device<Bound>" token for scopes where we can guarantee that the
> device is actually bound to a driver in C, we can't control the lifetime of the
> token as we can in Rust, which makes it pointless.
This is why Rust remains on my, learn when I have time, pile. The goal
would not be to "properly protect", but "sufficiently warn" the
unsuspecting. If anything is going to get me to convert a subsystem to
Rust it is to get help from the compiler for the review burden of the
above abuses.
> So, the best thing we can probably do is document that "This must only be called
> from a scope that guarantees that the device remains bound throughout." for all
> the devres accessors.
Agree.
> There may be one thing that comes to my mind that we could do though. While we
> can't catch those at compile time, we might be able to catch those on runtime.
>
> For instance, we could "abuse" lockdep and register a fake lock for a
> Device<Bound> scope and put a lockdep assert into all the devres accessors.
>
> However, fixing up all the violations that come up when introducing this sounds
> like a huge pain. :)
Right, and as you said there are many valid uses that are not typically
recommended that would make the warning not useful.
Powered by blists - more mailing lists