Message-ID: <20250226172120.GD28425@nvidia.com>
Date: Wed, 26 Feb 2025 13:21:20 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: Joel Fernandes <joelagnelf@...dia.com>,
Alexandre Courbot <acourbot@...dia.com>,
Dave Airlie <airlied@...il.com>, Gary Guo <gary@...yguo.net>,
Joel Fernandes <joel@...lfernandes.org>,
Boqun Feng <boqun.feng@...il.com>,
John Hubbard <jhubbard@...dia.com>, Ben Skeggs <bskeggs@...dia.com>,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
paulmck@...nel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
On Wed, Feb 26, 2025 at 02:16:58AM +0100, Danilo Krummrich wrote:
> > DRM achieves this, in part, by using drm_dev_unplug().
>
> No, DRM can have concurrent driver code running, which is why drm_dev_enter()
> returns whether the device is unplugged already, such that subsequent
> operations, (e.g. I/O) can be omitted.
Ah, I did notice that the driver was the one providing the
file_operations struct so of course the core code can't protect the
driver ops. Yuk :\
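To illustrate the enter/unplug idea discussed above, here is a minimal userspace sketch in Rust. It is only an analogue and not kernel code: the real drm_dev_enter()/drm_dev_unplug() are built on SRCU, whereas this sketch assumes a plain RwLock, and the names Device, EnterGuard and do_io are hypothetical.

```rust
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical userspace stand-in for a device that can be unplugged.
pub struct Device {
    unplugged: RwLock<bool>,
}

/// Held while a driver op is inside its "device is present" section.
/// Dropping it plays the role of drm_dev_exit().
pub struct EnterGuard<'a> {
    _guard: RwLockReadGuard<'a, bool>,
}

impl Device {
    pub fn new() -> Self {
        Device { unplugged: RwLock::new(false) }
    }

    /// Analogue of drm_dev_enter(): returns None if the device is already
    /// unplugged, so the caller can skip I/O.
    pub fn enter(&self) -> Option<EnterGuard<'_>> {
        let guard = self.unplugged.read().unwrap();
        if *guard {
            None
        } else {
            Some(EnterGuard { _guard: guard })
        }
    }

    /// Analogue of drm_dev_unplug(): taking the write lock waits for every
    /// outstanding EnterGuard to be dropped, then marks the device as gone.
    pub fn unplug(&self) {
        *self.unplugged.write().unwrap() = true;
    }
}

/// A driver op that may still run concurrently with unplug; it only
/// touches the "hardware" while it holds the guard.
pub fn do_io(dev: &Device) -> Result<u32, &'static str> {
    match dev.enter() {
        Some(_guard) => Ok(42), // pretend register read
        None => Err("device unplugged"),
    }
}
```

Note that driver code can still be running after unplug() returns; it just can no longer get past enter(), which is the point made in the quoted text.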
> Again, the reason a pci::Bar needs to be revocable in Rust is that we can't have
> the driver potentially keep the pci::Bar alive (or even access it) after the
> device is unbound.
My impression is that nobody has yet come up with a way to implement,
in safe Rust, the normal kernel design pattern of revoking threads
first and then freeing objects.
Yes, this is a peculiar lifetime model, but it is pretty important in
the kernel. I'm not convinced you can just fully ignore it in Rust as
a design pattern. We use it pretty much everywhere a function pointer
is involved.
For instance, I'm looking at workqueue.rs and wondering why it is safe
against Execute After Free (EAF) races. I see none of the C functions I
would expect to be used to prevent those races in the code.
Even the simple example:
//! fn print_later(val: Arc<MyStruct>) {
//!     let _ = workqueue::system().enqueue(val);
//! }
Seems to be missing the EAF prevention, i.e. WorkItem::run() is in the
.text of THIS_MODULE and I see nothing preventing THIS_MODULE from
being unloaded while the work is still queued or running.
The expectation for work queues is that they follow the above "revoke
threads, then free" pattern. A module should do that sequence in the
driver remove() or module __exit function.
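As a rough userspace sketch of that sequence (not kernel code, and not what workqueue.rs does): a plain std::thread stands in for a work item, and joining it stands in for a flush/cancel-sync step. The names Driver, probe and remove are hypothetical and chosen only to mirror the driver lifecycle.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a driver that owns a background worker.
pub struct Driver {
    stop: Arc<AtomicBool>,
    ticks: Arc<AtomicUsize>,
    worker: Option<JoinHandle<()>>,
}

impl Driver {
    /// Start the worker; it keeps touching shared state until told to stop.
    pub fn probe() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let ticks = Arc::new(AtomicUsize::new(0));
        let (s, t) = (Arc::clone(&stop), Arc::clone(&ticks));
        let worker = thread::spawn(move || loop {
            t.fetch_add(1, Ordering::Relaxed);
            if s.load(Ordering::Acquire) {
                break;
            }
            thread::sleep(Duration::from_millis(1));
        });
        Driver { stop, ticks, worker: Some(worker) }
    }

    /// Tear down in the order described above and return the tick count.
    pub fn remove(mut self) -> usize {
        // Step 1 (revoke): ask the worker to stop, then wait until it has
        // really exited. After join() no thread is running our code.
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.worker.take() {
            handle.join().expect("worker panicked");
        }
        // Step 2 (free): only now is it safe to release the state; `self`
        // is dropped when this function returns.
        self.ticks.load(Ordering::Relaxed)
    }
}
```

The ordering is the whole point: if the state were freed (or, in the module case, the code unloaded) before the join, the worker could still be executing.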
Jason