Message-Id: <DCHAPJRPKSSA.37QLQGAVCERCZ@nvidia.com>
Date: Mon, 01 Sep 2025 16:46:23 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Alistair Popple" <apopple@...dia.com>,
 <dri-devel@...ts.freedesktop.org>, <dakr@...nel.org>
Cc: "Miguel Ojeda" <ojeda@...nel.org>, "Alex Gaynor"
 <alex.gaynor@...il.com>, "Boqun Feng" <boqun.feng@...il.com>, "Gary Guo"
 <gary@...yguo.net>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, "Benno Lossin" <lossin@...nel.org>, "Andreas
 Hindborg" <a.hindborg@...nel.org>, "Alice Ryhl" <aliceryhl@...gle.com>,
 "Trevor Gross" <tmgross@...ch.edu>, "David Airlie" <airlied@...il.com>,
 "Simona Vetter" <simona@...ll.ch>, "Maarten Lankhorst"
 <maarten.lankhorst@...ux.intel.com>, "Maxime Ripard" <mripard@...nel.org>,
 "Thomas Zimmermann" <tzimmermann@...e.de>, "John Hubbard"
 <jhubbard@...dia.com>, "Joel Fernandes" <joelagnelf@...dia.com>, "Timur
 Tabi" <ttabi@...dia.com>, <linux-kernel@...r.kernel.org>,
 <nouveau@...ts.freedesktop.org>, "Nouveau"
 <nouveau-bounces@...ts.freedesktop.org>
Subject: Re: [PATCH 03/10] gpu: nova-core: gsp: Create wpr metadata

Hi Alistair,

On Wed Aug 27, 2025 at 5:20 PM JST, Alistair Popple wrote:
<snip>
> index 161c057350622..1f51e354b9569 100644
> --- a/drivers/gpu/nova-core/gsp.rs
> +++ b/drivers/gpu/nova-core/gsp.rs
> @@ -6,12 +6,17 @@
>  use kernel::dma_write;
>  use kernel::pci;
>  use kernel::prelude::*;
> -use kernel::ptr::Alignment;
> +use kernel::ptr::{Alignable, Alignment};
> +use kernel::sizes::SZ_128K;
>  use kernel::transmute::{AsBytes, FromBytes};
>  
> +use crate::fb::FbLayout;
> +use crate::firmware::Firmware;
>  use crate::nvfw::{
> -    LibosMemoryRegionInitArgument, LibosMemoryRegionKind_LIBOS_MEMORY_REGION_CONTIGUOUS,
> -    LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM,
> +    GspFwWprMeta, GspFwWprMetaBootInfo, GspFwWprMetaBootResumeInfo, LibosMemoryRegionInitArgument,
> +    LibosMemoryRegionKind_LIBOS_MEMORY_REGION_CONTIGUOUS,
> +    LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM, GSP_FW_WPR_META_MAGIC,
> +    GSP_FW_WPR_META_REVISION,
>  };
>  
>  pub(crate) const GSP_PAGE_SHIFT: usize = 12;
> @@ -25,12 +30,69 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
>  // are valid.
>  unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
>  
> +// SAFETY: Padding is explicit and will not contain uninitialized data.
> +unsafe impl AsBytes for GspFwWprMeta {}
> +
> +// SAFETY: This struct only contains integer types for which all bit patterns
> +// are valid.
> +unsafe impl FromBytes for GspFwWprMeta {}
> +
>  #[allow(unused)]
>  pub(crate) struct GspMemObjects {
>      libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
>      pub loginit: CoherentAllocation<u8>,
>      pub logintr: CoherentAllocation<u8>,
>      pub logrm: CoherentAllocation<u8>,
> +    pub wpr_meta: CoherentAllocation<GspFwWprMeta>,
> +}

I think `wpr_meta` is a bit out-of-place in this structure. There are
several reasons for this:

- All the other members of this structure (including `cmdq` which is
  added later) are referenced by `libos` and constitute the GSP runtime:
  they are used as long as the GSP is active. `wpr_meta`, OTOH, does not
  reference any of the other objects, nor is it referenced by them.
- `wpr_meta` is never used by the GSP, but needed as a parameter of
  Booter on SEC2 to load the GSP firmware. It can actually be discarded
  once this step is completed. This is very different from the rest of
  this structure, which is used by the GSP.

So I think it doesn't really belong here, and would probably fit better
in `Firmware`. Now the fault lies in my own series, which doesn't let
you build `wpr_meta` easily from there. I'll try to fix that in the v3.

And with the removal of `wpr_meta`, this structure ends up strictly
containing the GSP runtime, including the command queue... Maybe it can
simply be named `Gsp` then? It is even already in the right module! :)

Loosely related, but looking at this series made me realize there is a
very logical split of our firmware into two "bundles":

- The GSP bundle includes the GSP runtime data, which is this
  `GspMemObjects` structure minus `wpr_meta`. We pass it as an input
  parameter to the GSP firmware using the GSP's falcon mbox registers.
  It must live as long as the GSP is running.
- The SEC2 bundle includes Booter, `wpr_meta`, the GSP firmware binary,
  bootloader and its signatures (which are all referenced by
  `wpr_meta`). All this data is consumed by SEC2, and crucially can be
  dropped once the GSP is booted.

This separation is important as currently we are stuffing anything
firmware-related into the `Firmware` struct and keep it there forever,
consuming dozens of megabytes of host memory that we could free. Booting
the GSP is typically a one-time operation in the life of the GPU device,
and even if we ever need to do it again, we can very well build the SEC2
bundle from scratch again.

I will try to reflect the separation better in the v3 of my patchset -
then we can just build `wpr_meta` as a local variable of the method that
runs `Booter`, and drop it (alongside the rest of the SEC2 bundle) upon
return.
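
A rough sketch of that lifetime split, purely for illustration: plain
`Vec<u8>` buffers stand in for DMA allocations, and every name below
(`Sec2Bundle`, `run_booter`, `boot_gsp`, the live-bundle counter) is
hypothetical rather than actual nova-core code. The only point is that
the SEC2 bundle is a local of the boot method and is gone once it
returns, while the GSP runtime is handed back to the caller.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of SEC2 bundles currently alive; only here to make the drop observable.
static LIVE_SEC2_BUNDLES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical "GSP bundle": runtime data that must live as long as the GSP runs.
struct Gsp {
    libos: Vec<u8>,
    running: bool,
}

/// Hypothetical "SEC2 bundle": everything consumed by SEC2, needed only during boot.
struct Sec2Bundle {
    booter: Vec<u8>,
    gsp_firmware: Vec<u8>,
    wpr_meta: Vec<u8>,
}

impl Sec2Bundle {
    fn build() -> Self {
        LIVE_SEC2_BUNDLES.fetch_add(1, Ordering::SeqCst);
        Self {
            booter: vec![0; 64],
            gsp_firmware: vec![0; 1024],
            wpr_meta: vec![0; 256],
        }
    }

    fn size(&self) -> usize {
        self.booter.len() + self.gsp_firmware.len() + self.wpr_meta.len()
    }
}

impl Drop for Sec2Bundle {
    fn drop(&mut self) {
        LIVE_SEC2_BUNDLES.fetch_sub(1, Ordering::SeqCst);
    }
}

fn live_sec2_bundles() -> usize {
    LIVE_SEC2_BUNDLES.load(Ordering::SeqCst)
}

/// Stand-in for running Booter on SEC2: borrows the bundle, does not keep it.
fn run_booter(gsp: &mut Gsp, bundle: &Sec2Bundle) {
    gsp.running = bundle.size() > 0;
}

fn boot_gsp() -> Gsp {
    let mut gsp = Gsp { libos: vec![0; 4096], running: false };
    let bundle = Sec2Bundle::build();
    run_booter(&mut gsp, &bundle);
    gsp
    // `bundle` goes out of scope here, releasing the whole SEC2 bundle.
}

fn main() {
    let gsp = boot_gsp();
    println!(
        "running: {}, libos bytes: {}, live SEC2 bundles: {}",
        gsp.running,
        gsp.libos.len(),
        live_sec2_bundles()
    );
}
```

Rebuilding the bundle for a later re-boot would just be another call to
the (hypothetical) `Sec2Bundle::build()`.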

> +
> +pub(crate) fn build_wpr_meta(
> +    dev: &device::Device<device::Bound>,
> +    fw: &Firmware,
> +    fb_layout: &FbLayout,
> +) -> Result<CoherentAllocation<GspFwWprMeta>> {
> +    let wpr_meta =
> +        CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
> +    dma_write!(
> +        wpr_meta[0] = GspFwWprMeta {
> +            magic: GSP_FW_WPR_META_MAGIC as u64,
> +            revision: u64::from(GSP_FW_WPR_META_REVISION),
> +            sysmemAddrOfRadix3Elf: fw.gsp.lvl0_dma_handle(),
> +            sizeOfRadix3Elf: fw.gsp.size as u64,
> +            sysmemAddrOfBootloader: fw.gsp_bootloader.ucode.dma_handle(),
> +            sizeOfBootloader: fw.gsp_bootloader.ucode.size() as u64,
> +            bootloaderCodeOffset: u64::from(fw.gsp_bootloader.code_offset),
> +            bootloaderDataOffset: u64::from(fw.gsp_bootloader.data_offset),
> +            bootloaderManifestOffset: u64::from(fw.gsp_bootloader.manifest_offset),
> +            __bindgen_anon_1: GspFwWprMetaBootResumeInfo {
> +                __bindgen_anon_1: GspFwWprMetaBootInfo {
> +                    sysmemAddrOfSignature: fw.gsp_sigs.dma_handle(),
> +                    sizeOfSignature: fw.gsp_sigs.size() as u64,
> +                }
> +            },
> +            gspFwRsvdStart: fb_layout.heap.start,
> +            nonWprHeapOffset: fb_layout.heap.start,
> +            nonWprHeapSize: fb_layout.heap.end - fb_layout.heap.start,
> +            gspFwWprStart: fb_layout.wpr2.start,
> +            gspFwHeapOffset: fb_layout.wpr2_heap.start,
> +            gspFwHeapSize: fb_layout.wpr2_heap.end - fb_layout.wpr2_heap.start,
> +            gspFwOffset: fb_layout.elf.start,
> +            bootBinOffset: fb_layout.boot.start,
> +            frtsOffset: fb_layout.frts.start,
> +            frtsSize: fb_layout.frts.end - fb_layout.frts.start,
> +            gspFwWprEnd: fb_layout
> +                .vga_workspace
> +                .start
> +                .align_down(Alignment::new(SZ_128K)),
> +            gspFwHeapVfPartitionCount: fb_layout.vf_partition_count,
> +            fbSize: fb_layout.fb.end - fb_layout.fb.start,
> +            vgaWorkspaceOffset: fb_layout.vga_workspace.start,
> +            vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
> +            ..Default::default()
> +        }
> +    )?;
> +
> +    Ok(wpr_meta)

I've discussed the bindings abstractions with Danilo last week. We
agreed that no layout information should ever escape the `nvfw` module.
I.e. the fields of `GspFwWprMeta` should not even be visible here.

Instead, `GspFwWprMeta` should be wrapped privately into another
structure inside `nvfw`:

  /// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
  /// addresses of the GSP bootloader and firmware.
  #[repr(transparent)]
  pub(crate) struct GspFwWprMeta(r570_144::GspFwWprMeta);

All its implementations should also be there:

  // SAFETY: Padding is explicit and will not contain uninitialized data.
  unsafe impl AsBytes for GspFwWprMeta {}

  // SAFETY: This struct only contains integer types for which all bit patterns
  // are valid.
  unsafe impl FromBytes for GspFwWprMeta {}

And lastly, this `new` method can also be moved into `nvfw`, as an impl
block for the wrapping `GspFwWprMeta` type. That way no layout detail
escapes that module, and it will be easier to adapt the code to
potential layout changes with new firmware versions.

(note that my series is the one carelessly re-exporting `GspFwWprMeta`
as-is - I'll fix that too in v3)
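
To make the shape concrete, here is a minimal standalone sketch of the
wrapping described above. It is not the real code: the `bindings`
module stands in for the generated `r570_144` bindings, the struct is
cut down to four fields, the constant values are made up, and the
`new` signature and the `non_wpr_heap_size` accessor are assumptions
chosen for the example.

```rust
use std::mem::size_of;

/// Stand-in for the bindgen-generated module (`r570_144` in the text above).
mod bindings {
    #[allow(non_snake_case)]
    #[repr(C)]
    #[derive(Default)]
    pub struct GspFwWprMeta {
        pub magic: u64,
        pub revision: u64,
        pub nonWprHeapOffset: u64,
        pub nonWprHeapSize: u64,
    }
}

/// Only this module knows the field names and layout of the raw struct.
mod nvfw {
    use super::bindings;
    use std::ops::Range;

    // Made-up values; the real constants come from the firmware bindings.
    const META_MAGIC: u64 = 0x1234;
    const META_REVISION: u64 = 1;

    /// Transparent wrapper: same layout as the raw struct, but the inner
    /// field is private, so callers cannot touch individual fields.
    #[repr(transparent)]
    pub(crate) struct GspFwWprMeta(bindings::GspFwWprMeta);

    impl GspFwWprMeta {
        /// Builds the metadata from layout-independent inputs.
        pub(crate) fn new(heap: &Range<u64>) -> Self {
            Self(bindings::GspFwWprMeta {
                magic: META_MAGIC,
                revision: META_REVISION,
                nonWprHeapOffset: heap.start,
                nonWprHeapSize: heap.end - heap.start,
            })
        }

        /// Accessor added only so the example has something to observe.
        pub(crate) fn non_wpr_heap_size(&self) -> u64 {
            self.0.nonWprHeapSize
        }
    }
}

fn main() {
    let meta = nvfw::GspFwWprMeta::new(&(0x1000..0x5000));
    // `meta.0` would not compile here: the inner field is private to `nvfw`.
    println!("non-WPR heap size: {:#x}", meta.non_wpr_heap_size());
    println!(
        "wrapper and raw struct have the same size: {}",
        size_of::<nvfw::GspFwWprMeta>() == size_of::<bindings::GspFwWprMeta>()
    );
}
```

With this shape, a firmware update that renames or reorders fields only
requires touching the module that owns the wrapper.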

The same applies to `LibosMemoryRegionInitArgument` of the previous
patch, and other types introduced in subsequent patches. Usually there
is little more work to do than moving the implementations into `nvfw` as
everything is already abstracted correctly - just not where we
eventually want it.
