[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <e249a1d1-f1fb-4818-b9d6-e0b5c17c61a5@kernel.org>
Date: Sat, 30 Aug 2025 14:56:58 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Alexandre Courbot <acourbot@...dia.com>
Cc: Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <lossin@...nel.org>, Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
John Hubbard <jhubbard@...dia.com>, Alistair Popple <apopple@...dia.com>,
Joel Fernandes <joelagnelf@...dia.com>, Timur Tabi <ttabi@...dia.com>,
rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org,
nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v2 5/8] gpu: nova-core: firmware: process and prepare the
GSP firmware
On 8/29/25 1:16 PM, Alexandre Courbot wrote:
> On Thu Aug 28, 2025 at 8:27 PM JST, Danilo Krummrich wrote:
>> On 8/26/25 6:07 AM, Alexandre Courbot wrote:
>>> /// Structure encapsulating the firmware blobs required for the GPU to operate.
>>> #[expect(dead_code)]
>>> pub(crate) struct Firmware {
>>> @@ -36,7 +123,10 @@ pub(crate) struct Firmware {
>>> booter_unloader: BooterFirmware,
>>> /// GSP bootloader, verifies the GSP firmware before loading and running it.
>>> gsp_bootloader: RiscvFirmware,
>>> - gsp: firmware::Firmware,
>>> + /// GSP firmware.
>>> + gsp: Pin<KBox<GspFirmware>>,
>>
>> Is there a reason why we don't just propagate it through struct Gpu, which uses
>> pin-init already?
>>
>> You can make Firmware pin_data too and then everything is within the single
>> allocation of struct Gpu.
Thanks a lot for the write-up below!
> I tried doing that at first, and hit the problem that the `impl PinInit`
> returned by `GspFirmware::new` borrows a reference to the GSP firmware
> binary loaded by `Firmware::new` - when `Firmware::new` returns, the
> firmware gets freed, and the borrow checker complains.
>
> We could move the GSP firmware loading code into the `pin_init!` of
> `Firmware::new`, but now we hit another problem: in `Gpu::new` the
> following code is executed:
>
> FbLayout::new(chipset, bar, &fw.gsp_bootloader, &fw.gsp)?
>
> which requires the `Firmware` instance, which doesn't exist yet as the
> `Gpu` object isn't initialized until the end of the method.
>
> So we could move `FbLayout`, and everything else created by `Gpu::new`
> to become members of the `Gpu` instance. It does make sense actually:
> this `new` method is doing a lot of stuff, such as running FRTS, and
> with Alistair's series it even runs Booter, the sequencer and so on.
> Maybe we should move all firmware execution to a separate method that is
> called by `probe` after the `Gpu` is constructed, as right now the `Gpu`
> constructor looks like it does a bit more than it should.
Absolutely, executing the firmware should be a separate method. Having it in the
constructor makes things more difficult.
> ... but even when doing that, `Firmware::new` and `FbLayout::new` still
> require a reference to the `Bar`, and... you get the idea. :)
Lifetime wise this should be fine, the &Bar out-lives the constructor, since
it's lifetime is bound to the &Device<Bound> which lives for the entire duration
of probe().
> So I don't think the current design allows us to do that easily or at
> all, and even if it does, it will be at a significant cost in code
> clarity. There is also the fact that I am considering making the
> firmware member of `Gpu` a trait object: the boot sequence is so
> different between pre and post-Hopper that I don't think it makes sense
> to share the same `Firmware` structure between the two. I would rather
> see `Firmware` as an opaque trait object, which provides high-level
> methods such as "start GSP" behind which the specifics of each GPU
> family are hidden. If we go with this design, `Firmware` will become a
> trait object and so cannot be pinned into `Gpu`.
>
> This doesn't change my observation that `Gpu::new` should not IMHO do
> steps like booting the GSP - it should just acquire the resources it
> needs, return the pinned GPU object, and then `probe` can continue the
> boot sequence. Having the GPU object pinned and constructed early
> simplifies things quite a bit as the more we progress with boot, the
> harder it becomes to construct everything in place (and the `PinInit`
> closure also becomes more and more complex).
>
> I'm still laying down the general design, but I'm pretty convinced that
> having `Firmware` as a trait object is the right way to abstract the
> differences between GPU families.
Makes sense, it's fine with me to keep this in its separate allocation for the
purpose of making Firmware an opaque trait object, which sounds reasonable.
But we should really properly separate construction of the GPU structure from
firmware boot code execution as you say. And actually move the construction of
the GPU object into try_pin_init!().
Powered by blists - more mailing lists