Message-Id: <DC1PB630413R.33T95R794VWMC@kernel.org>
Date: Thu, 14 Aug 2025 01:50:14 +0200
From: "Danilo Krummrich" <dakr@...nel.org>
To: "John Hubbard" <jhubbard@...dia.com>
Cc: "Alexandre Courbot" <acourbot@...dia.com>, "Joel Fernandes"
<joelagnelf@...dia.com>, "Timur Tabi" <ttabi@...dia.com>, "Alistair Popple"
<apopple@...dia.com>, "David Airlie" <airlied@...il.com>, "Simona Vetter"
<simona@...ll.ch>, "Bjorn Helgaas" <bhelgaas@...gle.com>,
Krzysztof Wilczyński <kwilczynski@...nel.org>, "Miguel
Ojeda" <ojeda@...nel.org>, "Alex Gaynor" <alex.gaynor@...il.com>, "Boqun
Feng" <boqun.feng@...il.com>, "Gary Guo" <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>, "Benno Lossin"
<lossin@...nel.org>, "Andreas Hindborg" <a.hindborg@...nel.org>, "Alice
Ryhl" <aliceryhl@...gle.com>, "Trevor Gross" <tmgross@...ch.edu>,
<nouveau@...ts.freedesktop.org>, <linux-pci@...r.kernel.org>,
<rust-for-linux@...r.kernel.org>, "LKML" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] gpu: nova-core: avoid probing non-display/compute PCI
functions
On Thu Aug 14, 2025 at 1:28 AM CEST, John Hubbard wrote:
> NovaCore 0000:c1:00.0: GPU instance built
> NovaCore 0000:c1:00.1: Probe Nova Core GPU driver.
> NovaCore 0000:c1:00.1: enabling device (0000 -> 0002)
> NovaCore 0000:c1:00.1: probe with driver NovaCore failed with error -22
> ...
> Bad IO access at port 0x0 ()
> WARNING: CPU: 26 PID: 748 at lib/iomap.c:45 pci_iounmap+0x3f/0x50
> ...
> <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x2c/0x70 [nova_core]
> devres_release_all+0xa8/0xf0
> really_probe+0x30f/0x420
> __driver_probe_device+0x77/0xf0
> driver_probe_device+0x22/0x1b0
> __driver_attach+0x118/0x250
> bus_for_each_dev+0x105/0x130
> bus_add_driver+0x163/0x2a0
> driver_register+0x5d/0xf0
> init_module+0x6d/0x1000 [nova_core]
> do_one_initcall+0xde/0x380
> do_init_module+0x60/0x250
>
> ...and then:
> BUG: kernel NULL pointer dereference, address: 0000000000000538
> RIP: 0010:pci_release_region+0x10/0x60
> ...
> <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x36/0x70 [nova_core]
> devres_release_all+0xa8/0xf0
> really_probe+0x30f/0x420
> __driver_probe_device+0x77/0xf0
> driver_probe_device+0x22/0x1b0
> __driver_attach+0x118/0x250
> bus_for_each_dev+0x105/0x130
> bus_add_driver+0x163/0x2a0
> driver_register+0x5d/0xf0
> init_module+0x6d/0x1000 [nova_core]
> do_one_initcall+0xde/0x380
> do_init_module+0x60/0x250
This is caused by a bug in Devres, which I already fixed in [1].
With the patch in [1], nova-core should fail probing gracefully for the
unsupported device classes, as expected.
However, I think we still want to filter by PCI class, so the patch is fine in
general. :)
A few comments below.
[1] https://lore.kernel.org/lkml/20250812130928.11075-1-dakr@kernel.org/
>
> Signed-off-by: John Hubbard <jhubbard@...dia.com>
> ---
> drivers/gpu/nova-core/driver.rs | 13 +++++++++++++
> rust/kernel/pci.rs | 6 ++++++
> 2 files changed, 19 insertions(+)
>
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index 274989ea1fb4..4e0e6f5338e9 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -31,6 +31,19 @@ impl pci::Driver for NovaCore {
> fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>> {
> dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n");
>
> + let class_code = pdev.class();
> +
> + if class_code != bindings::PCI_CLASS_DISPLAY_VGA
> + && class_code != bindings::PCI_CLASS_DISPLAY_3D
I think it would be nice if we could provide a Rust enum for PCI classes, such
that this could be pci::Class::DISPLAY_VGA instead.
Of course, the same is true for PCI (sub)vendor and (sub)device IDs.
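To make the suggestion concrete, here is a minimal userspace sketch of what such a type could look like. Everything in it is hypothetical: `Class`, `from_raw()`, `as_u32()` and `is_supported()` are illustrative names, not existing kernel API. Only the two numeric values mirror PCI_CLASS_DISPLAY_VGA and PCI_CLASS_DISPLAY_3D from include/linux/pci_ids.h. The sketch uses a newtype with associated constants rather than a plain enum, on the assumption that class codes without a named constant should still be representable.

```rust
/// Hypothetical wrapper around the 16-bit PCI class/subclass code.
/// Not an existing kernel type; names are for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Class(u32);

impl Class {
    /// Mirrors PCI_CLASS_DISPLAY_VGA (0x0300).
    pub const DISPLAY_VGA: Self = Self(0x0300);
    /// Mirrors PCI_CLASS_DISPLAY_3D (0x0302).
    pub const DISPLAY_3D: Self = Self(0x0302);

    /// Builds a `Class` from the raw 24-bit `class` field of a PCI device,
    /// dropping the low byte (programming interface), as the patch does
    /// with `class >> 8`.
    pub const fn from_raw(raw: u32) -> Self {
        Self(raw >> 8)
    }

    /// Returns the class/subclass code as a plain integer, e.g. for logging.
    pub const fn as_u32(self) -> u32 {
        self.0
    }
}

/// Hypothetical helper: true for the classes the driver wants to probe.
pub fn is_supported(class: Class) -> bool {
    matches!(class, Class::DISPLAY_VGA | Class::DISPLAY_3D)
}
```

With something along these lines, the check in probe() would compare against named constants instead of raw bindings values.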
> + {
> + dev_dbg!(
> + pdev.as_ref(),
> + "Skipping non-display NVIDIA device with class 0x{:04x}\n",
> + class_code
> + );
> + return Err(kernel::error::code::ENODEV);
With the prelude included you should be able to use ENODEV directly.
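As a rough illustration of why the short form works, the sketch below imitates the relevant module layout in plain userspace Rust. The `kernel` module here is a stand-in written for this example, not the real crate, and `check_class()` is a hypothetical helper. The only point it is meant to show is that a prelude re-exporting the error codes lets `ENODEV` be named without its full path.

```rust
// Userspace stand-in for the kernel crate layout (assumption, for illustration).
mod kernel {
    pub mod error {
        /// Simplified error type holding a negative errno value.
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub struct Error(pub i32);

        /// Result alias defaulting to `()`, as kernel code commonly uses.
        pub type Result<T = ()> = core::result::Result<T, Error>;

        pub mod code {
            use super::Error;
            /// "No such device" (errno 19).
            pub const ENODEV: Error = Error(-19);
        }
    }

    pub mod prelude {
        // Re-exporting the codes is what makes the bare `ENODEV` available.
        pub use super::error::{code::*, Error, Result};
    }
}

use kernel::prelude::*;

/// Hypothetical helper mirroring the class check from the patch.
fn check_class(class_code: u32) -> Result {
    if class_code != 0x0300 && class_code != 0x0302 {
        // Short form, instead of kernel::error::code::ENODEV.
        return Err(ENODEV);
    }
    Ok(())
}
```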
> + }
> +
> pdev.enable_device_mem()?;
> pdev.set_master();
>
> diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
Please split the PCI part up into a separate patch.
> index 887ee611b553..b6416fe7bdfd 100644
> --- a/rust/kernel/pci.rs
> +++ b/rust/kernel/pci.rs
> @@ -399,6 +399,12 @@ pub fn device_id(&self) -> u16 {
> unsafe { (*self.as_raw()).device }
> }
>
> + /// Returns the PCI class code (class and subclass).
> + pub fn class(&self) -> u32 {
> + // SAFETY: `self.as_raw` is a valid pointer to a `struct pci_dev`.
> + unsafe { (*self.as_raw()).class >> 8 }
> + }
> +
> /// Returns the size of the given PCI bar resource.
> pub fn resource_len(&self, bar: u32) -> Result<bindings::resource_size_t> {
> if !Bar::index_is_valid(bar) {
>
> base-commit: dfc0f6373094dd88e1eaf76c44f2ff01b65db851
> --
> 2.50.1