[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z75RvU-Yb70iFyor@cassiopeiae>
Date: Wed, 26 Feb 2025 00:26:53 +0100
From: Danilo Krummrich <dakr@...nel.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Joel Fernandes <joelagnelf@...dia.com>,
Alexandre Courbot <acourbot@...dia.com>,
Dave Airlie <airlied@...il.com>, Gary Guo <gary@...yguo.net>,
Joel Fernandes <joel@...lfernandes.org>,
Boqun Feng <boqun.feng@...il.com>,
John Hubbard <jhubbard@...dia.com>, Ben Skeggs <bskeggs@...dia.com>,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
paulmck@...nel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice
implementation
On Tue, Feb 25, 2025 at 06:57:56PM -0400, Jason Gunthorpe wrote:
> I think this resource-revoke idea is deviating from the normal
> expected driver model by allowing driver code to continue to run in
> other threads once remove completes. That is definitely abnormal at
> least.
No, it simply guarantees that once remove() completed the pointer to the
resource can't be accessed anymore and the resource can't be kept alive
(which includes the actual memory mapping as well as the allocated resource
region).
It also solves the unplug problem, where ioctls can't access the resource
anymore after remove(). This is indeed a problem that does not affect all
subsystems.
>
> It is not necessarily *wrong*, but it sure is weird and as I explained
> above it has bad system level properties.
>
> Further, it seems to me there is a very unique DRM specific issue at
> work "time unbounded driver callbacks". A weird solution to this
> should not be baked into the common core kernel rust bindings and
> break the working model of all other subsystems that don't have that
> problem.
>
> > > Similarly you can have custom functions for short sequences of I/O ops, or use
> > > closures. I don't understand the concern.
> >
> > Yeah, this is certainly possible. I think one concern is similar to what you
> > raised on the other thread you shared [1]:
> > "Maybe we even want to replace it with SRCU entirely to ensure that drivers
> > can't stall the RCU grace period for too long by accident."
>
> I'd be worried about introducing a whole bunch more untestable failure
> paths in drivers. Especially in areas like work queue submit that are
> designed not to fail [2]. Non-failing work queues is a critical property
> that I've relied on countless times. I'm not sure you even *can* recover
> from this correctly in all cases.
>
> Then in the other email did you say that even some memory allocations
> go into this scheme? Yikes!
"For instance, let's take devm_kzalloc(). Once the device is detached
from the driver the memory allocated with this function is freed automatically.
The additional step in Rust is, that we'd not only free the memory, but also
revoke the access to the pointer that has been returned by devm_kzalloc() for
the driver, such that it can't be used by accident anymore."
This was just an analogy to explain what we're doing here. Obviously, memory
allocations can be managed by Rust's object lifetime management.
The reason we have Devres for device resources is that the lifetime of a
pci::Bar is *not* bound to the object lifetime directly, but to the lifetime of
the binding between a device and a driver. That's why it needs to be revoked
(which forcefully drops the object) when the device is unbound *not* when the
pci::Bar object is dropped regularly.
That's all the magic we're doing here. And again, this is not a change to the
device / driver model. It is making use of the device / driver model to ensure
safety.
>
> Further, hiding a synchronize_rcu in a devm destructor [3], once per
> revocable object is awful. If you imagine having a rcu around each of
> your revocable objects, how many synchronize_rcu()s is devm going to
> call post-remove()?
As many as you have MMIO mappings in your driver. But we can probably optimize
this to just a single synchronize_rcu().
Powered by blists - more mailing lists