[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250228184013.GF39591@nvidia.com>
Date: Fri, 28 Feb 2025 14:40:13 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: John Hubbard <jhubbard@...dia.com>,
Greg KH <gregkh@...uxfoundation.org>,
Danilo Krummrich <dakr@...nel.org>,
Joel Fernandes <joelagnelf@...dia.com>,
Alexandre Courbot <acourbot@...dia.com>,
Dave Airlie <airlied@...il.com>, Gary Guo <gary@...yguo.net>,
Joel Fernandes <joel@...lfernandes.org>,
Boqun Feng <boqun.feng@...il.com>, Ben Skeggs <bskeggs@...dia.com>,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
paulmck@...nel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice
implementation
On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote:
> - Nuke the driver binding manually through sysfs with the unbind files.
> - Nuke all userspace that might beholding files and other resources open.
> - At this point the module refcount should be zero and you can unload it.
>
> Except developers really don't like the manual unbind step, and so we're
> missing try_module_get() in a bunch of places where it really should be.
IMHO they are not missing, we just have a general rule that if a
cleanup function, required to be called prior to module exit, revokes
any .text pointers then you don't need to hold the module refcount.
file_operations doesn't have such a cleanup function which is why it
takes the refcount.
hrtimer does have such a function which is why it doesn't take the
refcount.
> Now wrt why you can't just solve this all at the subsystem level and
> guarantee that after drm_dev_unplug no code is running in driver callbacks
> anymore:
>
> In really, really simple subsystems like backlight this is doable. In drm
> with arbitrary ioctl this isn't, and you get to make a choice:
It is certainly doable, you list the right way to do it right below
and RDMA implements that successfully.
The subsytem owns all FDs and proxies all file_opertions to the driver
(after improving them :) and that is protected by a rwsem/SRCU that
is safe against the removal path setting all driver ops to NULL.
> - You wait until all driver code finishes, which could be never if there's
> ioctl that wait for render to complete and don't handle hotunplug
> correctly. This is a deadlock.
Meh. We waited for all FDs to close for along time. It isn't a
"deadlock" it is just a wait on userspace that extends to module
unload. Very undesirable yes, but not the end of the world, it can
resolve itself if userspace shutsdown.
But, IMHO, the subsystem and driver should shoot down the waits during
remove.
Your infinite waits are all interruptable right? :)
> In my experience this is theorically possible, practically no one gets
> this right and defacto means that actual hotunplug under load has a good
> chance of just hanging forever. Which is why drm doesn't do this.
See, we didn't have this problem as we don't have infinite waits in
driver as part of the API. The API toward the driver is event driven..
I can understand that adding the shootdown logic all over the place
would be hard and you'd get it wrong.
But so is half removing the driver while it is doing *anything* and
trying to mitigate that with a different kind of hard to do locking
fix. *shrug*
> This is why I like the rust Revocable so much, because it's a normal rcu
> section, so disallows all sleeping. You might still deadlock on a busy
> loop waiting for hw without having a timeout. But that's generally
> fairly easy to spot, and good drivers have macros/helpers for this so
> that there is always a timeout.
The Recovable version narrows the critical sections to very small
regions, but having critical sections at all is still, IMHO, hacky.
What you should ask Rust to solve for you is the infinite waits! That
is the root cause of your problem. Compiler enforces no waits with out
a revocation option on DRM callbacks!
Wouldn't that be much better??
> drm_dev_unplug uses sleepable rcu for practicality reasons and so has a
> much, much higher chance of deadlocks. Note that strictly speaking
> drm_device should hold a module reference on the driver, but see above
> for why we don't have that - developers prefer convenience over
> correctness in this area.
Doesn't DRM have a module reference because the fops is in the driver
and the file core takes the driver module reference during
fops_get()/replace_fops() in drm_stub_open()? Or do I misunderstand
what that stub is for?
Like, I see a THIS_MODULE in driver->fops == amdgpu_driver_kms_fops ?
> We can and should definitely try to make this much better. I think we can
> get to full correctness wrt the first 3 lifetime things in rust. I'm not
> sure whether handling module unload/.text lifetime is worth the bother,
> it's probably only going to upset developers if we try.
It hurts to read a suggestion we should ignore .text lifetime rules :(
DRM can be be like this, but please don't push that mess onto the rest
of the world in the common rust bindings or common rust design
patterns. Especially after places have invested alot to properly and
fully fix these problems without EAF bugs, infinite wait problems or
otherwise.
My suggestion is that new DRM rust drivers should have the file
operations isolation like RDMA does and a design goal to have
revocable sleeps. No EAF issue. You don't have to fix the whole DRM
subsystem to get here, just some fairly small work that only new rust
drivers would use. Start off on a good foot. <shrug>
Jason
Powered by blists - more mailing lists