[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <DG0ZES9SRSKD.11UBH7B75WNSX@kernel.org>
Date: Thu, 29 Jan 2026 10:56:22 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: <dan.j.williams@...el.com>
Cc: "Jason Gunthorpe" <jgg@...dia.com>, "Laurent Pinchart"
<laurent.pinchart@...asonboard.com>, "Bartosz Golaszewski"
<bartosz.golaszewski@....qualcomm.com>, "Johan Hovold" <johan@...nel.org>,
"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>, "Rafael J . Wysocki"
<rafael@...nel.org>, "Tzung-Bi Shih" <tzungbi@...nel.org>, "Linus Walleij"
<linusw@...nel.org>, "Jonathan Corbet" <corbet@....net>, "Shuah Khan"
<shuah@...nel.org>, "Wolfram Sang" <wsa+renesas@...g-engineering.com>,
"Simona Vetter" <simona.vetter@...ll.ch>, <linux-doc@...r.kernel.org>,
<linux-kselftest@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Bartosz
Golaszewski" <brgl@...nel.org>
Subject: Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote:
> Jason Gunthorpe wrote:
>> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
>> > > The latter already have robust schemes to help the driver shutdown and
>> > > end the concurrent operations. ie cancel_work_sync(),
>> > > del_timer_sync(), free_irq(), and *notifier_unregister().
>> >
>> > One a side note, devm_request_irq() is another of the devm_* helpers
>> > that cause race conditions, as interrupt handlers can run right after
>> > .remove() returns, which drivers will most likely not handle correctly.
>>
>> Yes! You *cannot* intermix devm and non-devm approaches without
>> creating very subtle bugs exactly like this. If your subsystem does
>> not provide a "devm register" helper its drivers shouldn't use devm.
>
> I wonder if we should have a proactive debug mode that checks for
> idiomatic devres usage and flags:
>
> - registering devres actions while the driver is detached
In Rust we already enforce this through the type system, i.e. we even fail to
compile the code when this is violated. (I.e. for all scopes that guarantee that
a device is bound (and will remain throughout) we give out a &Device<Bound>,
which serves as a cookie.)
In C I don't really see how that would be possible without additional locking,
as the only thing we could check is dev->driver, which obviously is racy.
> - registering devres actions for a device with a driver that has a
> .remove() method
This is perfectly valid, drivers might still be performing teardown operations
on the device (if it did not get hot unplugged).
> - passing a devres allocation to a kobject API
Well, this is an ownership violation. Technically, devres owns this allocation
and devres releases the allocation when the device is unbound. Which makes this
allocation only ever valid to access from a scope that is guaranteed that the
device is (and remains) bound.
> - invoking devres release actions from a kobject release API
This is similar to "registering devres actions while the driver is detached" and
falls into the boarder problem class of "messing with devres objects outside of
a Device<Bound> scope".
Again, I think in C we can't properly protect against this. While we could also
give out a "Device<Bound>" token for scopes where we can guarantee that the
device is actually bound to a driver in C, we can't control the lifetime of the
token as we can in Rust, which makes it pointless.
So, the best thing we can probably do is document that "This must only be called
from a scope that guarantees that the device remains bound throughout." for all
the devres accessors.
There may be one thing that comes to my mind that we could do though. While we
can't catch those at compile time, we might be able to catch those on runtime.
For instance, we could "abuse" lockdep and register a fake lock for a
Device<Bound> scope and put a lockdep assert into all the devres accessors.
However, fixing up all the violations that come up when introducing this sounds
like a huge pain. :)
Powered by blists - more mailing lists