Message-Id: <20250619-nova-frts-v6-0-ecf41ef99252@nvidia.com>
Date: Thu, 19 Jun 2025 22:23:44 +0900
From: Alexandre Courbot <acourbot@...dia.com>
To: Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Andreas Hindborg <a.hindborg@...nel.org>, Alice Ryhl <aliceryhl@...gle.com>,
Trevor Gross <tmgross@...ch.edu>, Danilo Krummrich <dakr@...nel.org>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
Benno Lossin <lossin@...nel.org>
Cc: John Hubbard <jhubbard@...dia.com>, Ben Skeggs <bskeggs@...dia.com>,
Joel Fernandes <joelagnelf@...dia.com>, Timur Tabi <ttabi@...dia.com>,
Alistair Popple <apopple@...dia.com>, linux-kernel@...r.kernel.org,
rust-for-linux@...r.kernel.org, nouveau@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, Alexandre Courbot <acourbot@...dia.com>,
Benno Lossin <lossin@...nel.org>, Lyude Paul <lyude@...hat.com>,
Shirish Baskaran <sbaskaran@...dia.com>
Subject: [PATCH v6 00/24] nova-core: run FWSEC-FRTS to perform first stage
of GSP initialization
Hi everyone,
After discussion, and since the `num` module seems to be taking more
time to reach consensus than the rest of this series, I have split it
into its own patch series. For now, Nova uses ad-hoc code (in only a
handful of places, thankfully) that will be replaced once the `num`
series lands. The split should also help `num` get more attention, as it
was until now buried inside a loosely related patch series.
This also includes an important fix for a bug discovered by Ben Skeggs
in the falcon code: the bit indicating the completion of memory
scrubbing was interpreted incorrectly, which created a race condition
that could result in a failure to boot the GSP. :O
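To make the class of bug concrete, here is a minimal sketch of polling a completion bit whose polarity is easy to get backwards. It is not the falcon code from this series: the `Hwcfg2` type, the bit position, and the assumption that the bit stays set while scrubbing is in progress are all hypothetical and chosen for illustration only.

```rust
/// Hypothetical snapshot of a falcon hardware-configuration register.
#[derive(Clone, Copy)]
struct Hwcfg2(u32);

impl Hwcfg2 {
    /// Assumed bit position, for illustration only.
    const MEM_SCRUBBING_BIT: u32 = 1 << 31;

    /// Raw bit: assumed to be set while scrubbing is still in progress.
    fn mem_scrubbing(self) -> bool {
        self.0 & Self::MEM_SCRUBBING_BIT != 0
    }

    /// Scrubbing is complete once the bit has been cleared.
    fn mem_scrubbing_done(self) -> bool {
        !self.mem_scrubbing()
    }
}

/// Polls `read` up to `max_polls` times. Returns the number of polls that
/// were needed before scrubbing was reported complete, or `None` if the
/// poll budget ran out first.
fn wait_mem_scrubbing(mut read: impl FnMut() -> Hwcfg2, max_polls: usize) -> Option<usize> {
    (1..=max_polls).find(|_| read().mem_scrubbing_done())
}
```

Under these assumptions, waiting on `mem_scrubbing()` instead of `mem_scrubbing_done()` would return on the very first poll while scrubbing is still running, which is the kind of race described above.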
Other than that, a few more minor refinements took place, but nothing
that changes this series considerably. The last patch tries to organize
the increasing number of TODO items we have in the code; until they can
be addressed, it would be nice to understand which task in `todo.rst`
they correspond to, so I took the liberty of annotating them all to that
effect.
Usual disclaimer: this series currently only successfully probes Ampere
GPUs, and does not allow the GPU to do anything useful yet. Upon
successful probe, the driver only displays, with debug priority, the
range of the WPR2 region constructed by FWSEC-FRTS:
[ 95.436000] NovaCore 0000:01:00.0: WPR2: 0xffc00000-0xffce0000
[ 95.436002] NovaCore 0000:01:00.0: GPU instance built
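As a rough sketch of how such a range could be derived from register fields and printed: the helper names, the field values, and the 4 KiB (shift by 12) granularity below are assumptions made for illustration, not definitions taken from the driver.

```rust
/// Assumed granularity of the WPR2 address fields (4 KiB); illustration only.
const WPR2_ADDR_SHIFT: u32 = 12;

/// Hypothetical helper: expands a register address field into a byte address.
fn wpr2_bound(field: u32) -> u64 {
    u64::from(field) << WPR2_ADDR_SHIFT
}

/// Formats a WPR2 range in the same style as the log line above.
fn format_wpr2(lo_field: u32, hi_field: u32) -> String {
    format!("WPR2: {:#x}-{:#x}", wpr2_bound(lo_field), wpr2_bound(hi_field))
}
```

With the hypothetical field values `0xffc00` and `0xffce0`, `format_wpr2` produces the same range as the first log line.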
This series is based on v6.16-rc1 with no other dependencies.
Some bits of documentation are still missing; these are addressed by
Joel in his own documentation patch series [1]. I'll also double-check
and send follow-up patches if anything is still missing after that.
[1] https://lore.kernel.org/rust-for-linux/20250503040802.1411285-1-joelagnelf@nvidia.com/
Signed-off-by: Alexandre Courbot <acourbot@...dia.com>
---
Changes in v6:
- Add `dma_handle_with_offset` method to CoherentAllocation.
- Move the `num` module into its own patchset and use ad-hoc code for
now.
- Add new items (and remove obsolete ones) to the TODO list, and tag `TODO`
entries in the code with their corresponding task in the list.
- Add `TIMEOUT:` comments wherever a timeout is used.
- Fix bug while waiting for falcon mem scrubbing to finish (thanks Ben
Skeggs!)
- Pass the firmware object instead of its DMA handle in `dma_wr`.
- Fix safety statements in `fwsec.rs`.
- Move FWSEC boot code to `FwsecFirmware` and a helper function of
`Gpu` to simplify `Gpu::new`.
- Add helper methods to NV_PFB_PRI_MMU_WPR2_ADDR_* to obtain the exact
address.
- Fix build errors and warnings with Rust 1.78.
- Link to v5: https://lore.kernel.org/r/20250612-nova-frts-v5-0-14ba7eaf166b@nvidia.com
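For readers unfamiliar with the idea behind `dma_handle_with_offset`, the following standalone sketch models it outside the kernel. The `Allocation` struct, the `DmaAddr` alias, and the `OutOfBounds` error are stand-ins invented for this example; the actual method lives on `CoherentAllocation` and its exact signature and error type may differ.

```rust
/// Stand-in for the kernel's `dma_addr_t`; an assumption for this sketch.
type DmaAddr = u64;

/// Simplified model of a coherent allocation: a base DMA handle and a size
/// in bytes.
struct Allocation {
    dma_handle: DmaAddr,
    size: usize,
}

/// Hypothetical error returned when the offset lies outside the allocation.
#[derive(Debug, PartialEq)]
struct OutOfBounds;

impl Allocation {
    /// Returns the DMA address `offset` bytes into the allocation, rejecting
    /// offsets that do not fall inside it.
    fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddr, OutOfBounds> {
        if offset >= self.size {
            return Err(OutOfBounds);
        }
        // Guard against address overflow as well.
        self.dma_handle
            .checked_add(offset as DmaAddr)
            .ok_or(OutOfBounds)
    }
}
```

Checking the offset in one place means callers that need to point the hardware at the middle of a buffer do not have to repeat the bounds arithmetic themselves.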
Changes in v5:
- Rebased on top of 6.16-rc1.
- Improve invariants of CoherentAllocation related to the new `size`
method.
- Use SZ_* consts when redefining BAR0 size.
- Split VBIOS patch into 3 patches (Joel)
- Convert all `Result<()>` into `Result`.
- Use `.cast::<T>()` instead of ` as ` to convert pointer types.
- Use `KBox` instead of `Arc` for falcon HALs.
- Do not use `get_` prefix on methods that do not increase reference
count.
- Replace arbitrary immediate values with proper constants.
- Use EIO to indicate firmware errors.
- Use inspect_err to be more verbose on which step of the FWSEC setup
failed.
- Move sysmem flush page into its own type and add its registration to
the FB HAL.
- Turn HAL getters into standalone functions.
- Patch FWSEC command at construction time.
- Force the signing stage (or an explicit non-signing state transition)
on the firmware DMA objects.
- Link to v4: https://lore.kernel.org/r/20250521-nova-frts-v4-0-05dfd4f39479@nvidia.com
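The `inspect_err` item above can be illustrated with a small userspace sketch. The step functions, messages, and the `Vec<String>` log are hypothetical stand-ins (the driver would use the kernel's logging macros and error type); only `Result::inspect_err` itself is the real standard-library method.

```rust
/// Hypothetical setup step; `succeed` simulates the hardware outcome.
fn load_firmware(succeed: bool) -> Result<(), &'static str> {
    if succeed { Ok(()) } else { Err("EIO") }
}

/// Second hypothetical setup step.
fn boot_firmware(succeed: bool) -> Result<(), &'static str> {
    if succeed { Ok(()) } else { Err("EIO") }
}

/// Runs both steps, recording which one failed before propagating the error.
fn run_steps(load_ok: bool, boot_ok: bool, log: &mut Vec<String>) -> Result<(), &'static str> {
    load_firmware(load_ok)
        .inspect_err(|e| log.push(format!("failed to load firmware: {}", e)))?;
    boot_firmware(boot_ok)
        .inspect_err(|e| log.push(format!("failed to boot firmware: {}", e)))?;
    Ok(())
}
```

Because `inspect_err` only observes the error and passes the `Result` through unchanged, the `?` operator still propagates the original error while the log says which step produced it.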
Changes in v4:
- Improve documentation of falcon security modes (thanks Joel!)
- Add the definition of the size of CoherentAllocation as one of its
invariants.
- Better document GFW boot progress, registers and use wait_on() helper,
and move it to `gfw` module instead of `devinit`.
- Add missing TODOs for workarounds waiting to be replaced by in-flight
R4L features.
- Register macro: add the offset of the register as a type constant, and
allow register aliases for registers which can be interpreted
differently depending on context.
- Rework the `num` module using only macros (to allow use of overflowing
ops), and add the `PowerOfTwo` type.
- Add a proper HAL to the `fb` module.
- Move HAL builders to impl blocks of Chipset.
- Add proper types and traits for signatures.
- Proactively split FalconFirmware into distinct traits to ease
management of v2 vs v3 FWSEC headers that will be needed for Turing
support.
- Link to v3:
https://lore.kernel.org/r/20250507-nova-frts-v3-0-fcb02749754d@nvidia.com
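A helper that waits on a condition with a timeout, such as the `wait_on()` mentioned in the v4 list above, could look roughly like the userspace sketch below. It uses `std::time::Instant` rather than kernel time APIs, and the `TimedOut` error type and exact signature are assumptions for illustration; the in-tree helper may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical error returned when the condition is not met in time.
#[derive(Debug, PartialEq)]
struct TimedOut;

/// Repeatedly evaluates `cond` until it yields `Some(value)` or `timeout`
/// elapses. The condition is always evaluated at least once.
fn wait_on<R>(timeout: Duration, mut cond: impl FnMut() -> Option<R>) -> Result<R, TimedOut> {
    let start = Instant::now();
    loop {
        if let Some(value) = cond() {
            return Ok(value);
        }
        if start.elapsed() > timeout {
            return Err(TimedOut);
        }
        // Busy-wait hint; a real driver would likely sleep or delay here.
        std::hint::spin_loop();
    }
}
```

Returning the value produced by the condition, rather than a bare boolean, lets the caller reuse whatever was read while polling (for example a register snapshot) without reading it again.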
Changes in v3:
- Rebased on top of latest nova-next.
- Use the new Devres::access() and remove the now unneeded with_bar!()
macro.
- Dropped `rust: devres: allow to borrow a reference to the resource's
Device` as it is not needed anymore.
- Fixed more erroneous uses of `ERANGE` error.
- Optimized alignment computations of the FB layout a bit.
- Link to v2: https://lore.kernel.org/r/20250501-nova-frts-v2-0-b4a137175337@nvidia.com
Changes in v2:
- Rebased on latest nova-next.
- Fixed all clippy warnings.
- Added `count` and `size` methods to `CoherentAllocation`.
- Added method to obtain a reference to the `Device` from a `Devres`
(this is super convenient).
- Split `DmaObject` into its own patch and added `Deref` implementation.
- Squashed field names from [3] into "extract FWSEC from BIOS".
- Fixed erroneous use of `ERANGE` error.
- Reworked `register!()` macro towards a more intuitive syntax, moved
its helper macros into internal rules to avoid polluting the macro
namespace.
- Renamed all registers to capital snake case to better match OpenRM.
- Removed declarations for registers that are not used yet.
- Added more documentation for items not covered by Joel's documentation
patches.
- Removed timer device and replaced it with a helper function using
`Ktime`. This also made [4] unneeded so it is dropped.
- Unregister the sysmem flush page upon device destruction.
- ... probably more that I forgot. >_<
- Link to v1: https://lore.kernel.org/r/20250420-nova-frts-v1-0-ecd1cca23963@nvidia.com
[3] https://lore.kernel.org/all/20250423225405.139613-6-joelagnelf@nvidia.com/
[4] https://lore.kernel.org/lkml/20250420-nova-frts-v1-1-ecd1cca23963@nvidia.com/
---
Alexandre Courbot (21):
rust: dma: fix comment
rust: dma: expose the count and size of CoherentAllocation
rust: dma: add dma_handle_with_offset method to CoherentAllocation
rust: make ETIMEDOUT error available
rust: sizes: add constants up to SZ_2G
gpu: nova-core: use absolute paths in register!() macro
gpu: nova-core: add delimiter for helper rules in register!() macro
gpu: nova-core: expose the offset of each register as a type constant
gpu: nova-core: allow register aliases
gpu: nova-core: increase BAR0 size to 16MB
gpu: nova-core: add helper function to wait on condition
gpu: nova-core: wait for GFW_BOOT completion
gpu: nova-core: add DMA object struct
gpu: nova-core: register sysmem flush page
gpu: nova-core: add falcon register definitions and base code
gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
gpu: nova-core: compute layout of the FRTS region
gpu: nova-core: add types for patching firmware binaries
gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
gpu: nova-core: load and run FWSEC-FRTS
gpu: nova-core: update and annotate TODO list
Joel Fernandes (3):
gpu: nova-core: vbios: Add base support for VBIOS construction and iteration
gpu: nova-core: vbios: Add support to look up PMU table in FWSEC
gpu: nova-core: vbios: Add support for FWSEC ucode extraction
Documentation/gpu/nova/core/todo.rst | 107 +--
drivers/gpu/nova-core/dma.rs | 58 ++
drivers/gpu/nova-core/driver.rs | 6 +-
drivers/gpu/nova-core/falcon.rs | 554 ++++++++++++++
drivers/gpu/nova-core/falcon/gsp.rs | 24 +
drivers/gpu/nova-core/falcon/hal.rs | 54 ++
drivers/gpu/nova-core/falcon/hal/ga102.rs | 119 +++
drivers/gpu/nova-core/falcon/sec2.rs | 10 +
drivers/gpu/nova-core/fb.rs | 136 ++++
drivers/gpu/nova-core/fb/hal.rs | 39 +
drivers/gpu/nova-core/fb/hal/ga100.rs | 57 ++
drivers/gpu/nova-core/fb/hal/ga102.rs | 36 +
drivers/gpu/nova-core/fb/hal/tu102.rs | 58 ++
drivers/gpu/nova-core/firmware.rs | 108 +++
drivers/gpu/nova-core/firmware/fwsec.rs | 423 +++++++++++
drivers/gpu/nova-core/gfw.rs | 41 +
drivers/gpu/nova-core/gpu.rs | 132 +++-
drivers/gpu/nova-core/nova_core.rs | 5 +
drivers/gpu/nova-core/regs.rs | 288 +++++++
drivers/gpu/nova-core/regs/macros.rs | 65 +-
drivers/gpu/nova-core/util.rs | 28 +
drivers/gpu/nova-core/vbios.rs | 1157 +++++++++++++++++++++++++++++
rust/kernel/dma.rs | 48 +-
rust/kernel/error.rs | 1 +
rust/kernel/sizes.rs | 24 +
25 files changed, 3504 insertions(+), 74 deletions(-)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250417-nova-frts-96ef299abe2c
Best regards,
--
Alexandre Courbot <acourbot@...dia.com>