Message-ID: <04594008-7b83-44bf-9e60-930a673dc2ec@nvidia.com>
Date: Thu, 13 Nov 2025 14:59:52 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: John Hubbard <jhubbard@...dia.com>, Danilo Krummrich <dakr@...nel.org>
Cc: Alexandre Courbot <acourbot@...dia.com>, Timur Tabi <ttabi@...dia.com>,
 Alistair Popple <apopple@...dia.com>, Edwin Peer <epeer@...dia.com>,
 Zhi Wang <zhiw@...dia.com>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>, Bjorn Helgaas <bhelgaas@...gle.com>,
 Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
 Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>,
 Benno Lossin <lossin@...nel.org>, Andreas Hindborg <a.hindborg@...nel.org>,
 Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
 nouveau@...ts.freedesktop.org, rust-for-linux@...r.kernel.org,
 LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 4/4] gpu: nova-core: add boot42 support for next-gen
 GPUs

Hi John,

On 11/11/2025 11:30 PM, John Hubbard wrote:
> NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
> architecture and revision details, and will instead use NV_PMC_BOOT_42
> in the future. NV_PMC_BOOT_0 will contain a specific set of values
> that will mean "go read NV_PMC_BOOT_42 instead".
> 
> Change the selection logic in Nova so that it will claim Turing and
> later GPUs. This will work for the foreseeable future, without any
> further code changes here, because all NVIDIA GPUs are considered, from
> the oldest supported on Linux (NV04), through the future GPUs.

[...]

> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index cd58040b681b..8c5f46f6aaac 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -175,19 +175,41 @@ pub(crate) struct Spec {
>  
>  impl Spec {
>      fn new(bar: &Bar0) -> Result<Spec> {
> +        // Some brief notes about boot0 and boot42, in chronological order:
> +        //
> +        // NV04 through NV50:
> +        //
> +        //    Not supported by Nova. boot0 is necessary and sufficient to identify these GPUs.
> +        //    boot42 may not even exist on some of these GPUs.
> +        //
> +        // Fermi through Volta:
> +        //
> +        //     Not supported by Nova. boot0 is still sufficient to identify these GPUs, but boot42
> +        //     is also guaranteed to be both present and accurate.
> +        //
> +        // Turing and later:
> +        //
> +        //     Supported by Nova. Identified by first checking boot0 to ensure that the GPU is not
> +        //     from an earlier (pre-Fermi) era, and then using boot42 to precisely identify the GPU.
> +        //     Somewhere in the Rubin timeframe, boot0 will no longer have space to add new GPU IDs.
> +
>          let boot0 = regs::NV_PMC_BOOT_0::read(bar);
>  
> -        Spec::try_from(boot0)
> +        if boot0.is_older_than_fermi() {
> +            return Err(ENOTSUPP);
> +        }
> +
> +        Spec::try_from(regs::NV_PMC_BOOT_42::read(bar))

The error returns are inconsistent here. For NV04 through NV50, this returns
-ENOTSUPP. For Fermi through Volta, it reads boot42 but returns -ENODEV,
because `Spec::try_from()` -> `boot42.chipset()` will return -ENODEV. I am OK
with either error code, but it would be good to make them consistent.

There also does not seem to be a diagnostic when the chipset is not supported.
It would be good to report that the chipset did not match; right now the code
just returns -ENODEV, which could also mean that the device does not exist.
-ENOTSUPP is better, but an actual dmesg error message would be nice.
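
To illustrate both points, here is a rough standalone sketch (plain Rust, not
kernel code). Everything in it is hypothetical: the `Era` enum stands in for
the decoding of boot0/boot42, the `Error` type stands in for the kernel error
codes, the returned string stands in for a dmesg message, and the register
values used with it are made up. It only shows the shape of the idea: both
unsupported ranges produce the same error, along with a message that names the
register value that failed to match.

```rust
// Standalone model of the suggestion, not nova-core code.
// All names and values below are hypothetical.

#[derive(Debug, PartialEq, Clone, Copy)]
enum Error {
    // Stand-in for the kernel's ENOTSUPP error code.
    ENOTSUPP,
}

// Hypothetical, pre-decoded GPU generation. The real driver would derive
// this from the fields of boot0 and boot42.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Era {
    PreFermi,      // NV04 through NV50
    FermiToVolta,  // boot42 is readable, but Nova does not support these
    TuringOrLater, // supported by Nova
}

#[derive(Debug, PartialEq)]
struct Spec {
    chipset: u32,
}

// Builds the single error used for every unsupported GPU, plus a message
// that a driver could print (the String is a stand-in for a dmesg line).
fn unsupported(what: &str, raw: u32) -> (Error, String) {
    (Error::ENOTSUPP, format!("unsupported {}: {:#010x}", what, raw))
}

// Both unsupported ranges return the same error code, each with a
// diagnostic that includes the raw register value.
fn spec_new(era: Era, boot0: u32, boot42: u32) -> Result<Spec, (Error, String)> {
    match era {
        Era::PreFermi => Err(unsupported("pre-Fermi GPU (boot0)", boot0)),
        Era::FermiToVolta => Err(unsupported("chipset (boot42)", boot42)),
        Era::TuringOrLater => Ok(Spec { chipset: boot42 }),
    }
}
```

In the driver itself, the message would presumably go through the usual device
logging path rather than being returned to the caller.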

With these,

Reviewed-by: Joel Fernandes <joelagnelf@...dia.com>

Thanks.


