[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2025032455-elk-whoops-3439@gregkh>
Date: Mon, 24 Mar 2025 06:54:47 -0700
From: Greg KH <gregkh@...uxfoundation.org>
To: Danilo Krummrich <dakr@...nel.org>
Cc: bhelgaas@...gle.com, rafael@...nel.org, ojeda@...nel.org,
alex.gaynor@...il.com, boqun.feng@...il.com, gary@...yguo.net,
bjorn3_gh@...tonmail.com, benno.lossin@...ton.me,
a.hindborg@...nel.org, aliceryhl@...gle.com, tmgross@...ch.edu,
linux-pci@...r.kernel.org, rust-for-linux@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/3] rust: pci: impl TryFrom<&Device> for &pci::Device
On Sun, Mar 23, 2025 at 11:10:27PM +0100, Danilo Krummrich wrote:
> On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> > On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> > > On Fri, Mar 21, 2025 at 10:47:54PM +0100, Danilo Krummrich wrote:
> > >
> > > Yes, over time, many subsystems were forced to come up with ways of
> > > determining the device type, look at dev_is_pci() and where it's used
> > > all over the place (wait, why can't you just use that here too?)
> >
> > It's a macro and bindgen doesn't pick it up for us to use.
> >
> > > But
> > > those usages are the rare exception. And I still think that was the
> > > wrong choice.
> > >
> > > What I don't want to see is this type of function to be the only way you
> > > can get a pci device from a struct device in rust.
> >
> > I see where you're coming from and I totally agree with that.
> >
> > The goal is to cover as much as possible by the corresponding abstractions, such
> > that the abstraction already provide the driver with the correct (device) type,
> > just like the bus callbacks do.
> >
> > We can do the same thing with IOCTLs, sysfs ops, etc. The miscdevice
> > abstractions are a good example for this. The DRM abstractions will also provide
> > the DRM device (typed to its corresponding driver) for all DRM IOCTLs a driver
> > registers.
> >
> > > That's going to be
> > > messy as that is going to be a very common occurance (maybe, I haven't
> > > seen how you use this), and would force everyone to always test the
> > > return value, adding complexity and additional work for everyone, just
> > > because the few wants it.
> >
> > Given the above, I think in most cases we are (and plan to be) a step ahead and
> > (almost) never have the driver in the situation that it ever needs any
> > container_of() like thing, because the abstraction already provides the correct
> > type.
> >
> > However, there may be some rare exceptions, where we need the driver to
> > "upcast", more below.
> >
> > > So where/how are you going to use this? Why can't you just store the
> > > "real" reference to the pci device? If you want to save off a generic
> > > device reference, where did it come from? Why can't you just explicitly
> > > save off the correct type at the time you stored it? Is this just
> > > attempting to save a few pointers? Is this going to be required to be
> > > called as the only way to get from a device to a pci device in a rust
> > > driver?
> >
> > As mentioned above, wherever the abstraction can already provide the correct
> > (device) type to the driver, that's what we should do instead.
> >
> > One specific use-case I see with this is for the auxiliary bus and likely MFD,
> > where we want to access the parent of a certain device.
> >
> > For instance, in nova-drm we'll probe against an auxiliary device registered by
> > nova-core. nova-drm will use its auxiliary device to call into nova-core.
> > nova-core can then validate that it is the originator of the auxiliary device
> > (e.g. by name and ID, or simply pointer comparison).
> >
> > Subsequently, nova-core can find its (in this case) pci::Device instance through
> > the parent device (Note that I changed this, such that you can only get the
> > parent device from an auxiliary::Device, as discussed).
> >
> > pub fn some_operation(adev: &auxiliary::Device, ...) -> ... {
> > // validate that `adev` was created by us
> >
> > if let Some(parent) = adev.parent() {
> > let pdev: &pci::Device = parent.try_into()?;
> > ...
> > }
> > }
> >
> > Now, I understand that your concern isn't necessarily about the fact that we
> > "upcast", but that it is fallible.
> >
> > And you're not wrong about this, nova-core knows for sure that the parent device
> > must be its PCI device, since it can indeed validate that the auxiliary device
> > was created and registered by itself.
> >
> > However, any other API that does not validate that `parent` is actually a
> > pci::Device in the example above would be fundamentally unsafe. And I think we
> > don't want to give out unsafe APIs to drivers.
> >
> > Given that this is only needed in very rare cases (since most of the time the
> > driver should be provided with the correct type directly) I'm not concerned
> > about the small overhead of the underlying pointer comparison. It's also not
> > creating messy code as it would be in C.
> >
> > In C it would be
> >
> > struct pci_dev *pdev;
> >
> > if (!dev_is_pci(parent))
> > return -EINVAL;
> >
> > pdev = to_pci_dev(parent);
> >
> > Whereas in Rust it's only
> >
> > let pdev: &pci::Device = parent.try_into()?;
> >
> > and without any validation in Rust
> >
> > // SAFETY: explain why this is safe to do
> > let pdev = unsafe { pci::Device::from_device(parent) };
> >
> > > Along these lines, if you can convince me that this is something that we
> > > really should be doing, in that we should always be checking every time
> > > someone would want to call to_pci_dev(), that the return value is
> > > checked, then why don't we also do this in C if it's going to be
> > > something to assure people it is going to be correct? I don't want to
> > > see the rust and C sides get "out of sync" here for things that can be
> > > kept in sync, as that reduces the mental load of all of us as we travers
> > > across the boundry for the next 20+ years.
> >
> > I think in this case it is good when the C and Rust side get a bit
> > "out of sync":
>
> A bit more clarification on this:
>
> What I want to say with this is, since we can cover a lot of the common cases
> through abstractions and the type system, we're left with the not so common
> ones, where the "upcasts" are not made in the context of common and well
> established patterns, but, for instance, depend on the semantics of the driver;
> those should not be unsafe IMHO.
>
> I think for C the situation is a bit different, because there we don't have a
> type system that can take care of the majority of cases (at compile time). Every
> such operation is fundamentally unsafe. So, it's a bit hard to draw a boundary.
>
> (In Rust, it's pretty simple to draw the boundary; if it can't be covered by the
> type system or by the abstraction, we need to check to uphold the safety
> guarantees.)
>
> So, for the majority of cases, since it's *always* an additional runtime check
> in C, I think we shouldn't have it. But probably there are rare cases where it
> also wouldn't be the worst idea to check. :)
Thanks for the long writeup, that makes more sense.
Ok, I'll not object to this change at all, so if you need them to go
through a different tree, feel free to take them. But as I think it's
the merge window now, it will have to wait till -rc1 is out.
And yes, maybe we do need a version of this for C code. :)
> In Rust we should be able to cover the majority of cases through abstractions
> and the type system, but in the cases where we can't do so, I don't think we
> should fall back to unsafe code for drivers.
>
> > For most of the cases where we can cover things through the type system and
> > already provide the driver with the correct object instance from the get-go,
> > that's a great improvement over C.
> >
> > For the very rare cases where we have to "upcast", I really think it's better to
> > uphold providing safe APIs to drivers, rather than unsafe ones.
Fair enough, hopefully this makes things simpler for driver authors in
the end as it is not a common case.
greg k-h
Powered by blists - more mailing lists