lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <D9D6UCL4PMJY.1MBEFIIHWT5F5@nvidia.com>
Date: Tue, 22 Apr 2025 22:06:00 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Danilo Krummrich" <dakr@...nel.org>, "Joel Fernandes"
 <joelagnelf@...dia.com>
Cc: "Miguel Ojeda" <ojeda@...nel.org>, "Alex Gaynor"
 <alex.gaynor@...il.com>, "Boqun Feng" <boqun.feng@...il.com>, "Gary Guo"
 <gary@...yguo.net>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, "Benno Lossin" <benno.lossin@...ton.me>,
 "Andreas Hindborg" <a.hindborg@...nel.org>, "Alice Ryhl"
 <aliceryhl@...gle.com>, "Trevor Gross" <tmgross@...ch.edu>, "David Airlie"
 <airlied@...il.com>, "Simona Vetter" <simona@...ll.ch>, "Maarten Lankhorst"
 <maarten.lankhorst@...ux.intel.com>, "Maxime Ripard" <mripard@...nel.org>,
 "Thomas Zimmermann" <tzimmermann@...e.de>, "Jonathan Corbet"
 <corbet@....net>, "John Hubbard" <jhubbard@...dia.com>, "Ben Skeggs"
 <bskeggs@...dia.com>, "Timur Tabi" <ttabi@...dia.com>, "Alistair Popple"
 <apopple@...dia.com>, <linux-kernel@...r.kernel.org>,
 <rust-for-linux@...r.kernel.org>, <nouveau@...ts.freedesktop.org>,
 <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH 08/16] gpu: nova-core: wait for GFW_BOOT completion

Hi Joel, Danilo,

On Tue Apr 22, 2025 at 8:28 PM JST, Danilo Krummrich wrote:
> On Mon, Apr 21, 2025 at 05:45:33PM -0400, Joel Fernandes wrote:
>> On 4/20/2025 8:19 AM, Alexandre Courbot wrote:
>> > diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs
>> > new file mode 100644
>> > index 0000000000000000000000000000000000000000..ee5685aff845aa97d6b0fbe9528df9a7ba274b2c
>> > --- /dev/null
>> > +++ b/drivers/gpu/nova-core/devinit.rs
>> > @@ -0,0 +1,40 @@
>> > +// SPDX-License-Identifier: GPL-2.0
>> > +
>> > +//! Methods for device initialization.
>> 
>> Let us clarify what devinit means.
>> 
>> devinit is a sequence of register read/writes after reset that performs tasks
>> such as:
>> 1. Programming VRAM memory controller timings.
>> 2. Power sequencing.
>> 3. Clock and PLL configuration.
>> 4. Thermal management.
>> 5. Performs VRAM memory scrubbing (ECC initialization) - on some GPUs, it scrubs
>> only part of memory and then kicks of 'async scrubbing'.
>> 
>> devinit itself is a 'script' which is interpreted by the PMU microcontroller of
>> of the GPU by an interpreter program.
>> 
>> Note that devinit also needs to run during suspend/resume at runtime.
>
> Thanks for writing this up. I fully agree that those things have to be
> documented.
>
>> 
>> I talked with Alex and I could add a new patch on top of this patch to add these
>> clarifying 'doc' comments as well. I will commit them to my git branch and send
>> on top of this as needed, but Alex can feel free to decide to squash them as well.
>
> Fine with both, whatever you guys prefer.

If that works with you, I will put Joel's patches improving the
documentation right after mines adding the code in the next revision. I
know this ideally should be a single patch, but researching this stuff
(and producing a proper writeup) is quite involved and a separate kind
of task from the quickly-translate-code-while-peeking-at-OpenRM work
that I did. 

>
>> 
>> > +
>> > +use kernel::bindings;
>> > +use kernel::devres::Devres;
>> > +use kernel::prelude::*;
>> > +
>> > +use crate::driver::Bar0;
>> > +use crate::regs;
>> > +
>> > +/// Wait for devinit FW completion.
>> > +///
>> > +/// Upon reset, the GPU runs some firmware code to setup its core parameters. Most of the GPU is
>> > +/// considered unusable until this step is completed, so it must be waited on very early during
>> > +/// driver initialization.
>> > +pub(crate) fn wait_gfw_boot_completion(bar: &Devres<Bar0>) -> Result<()> {
>> 
>> To reduce acronym soup, we can clarify gfw means 'GPU firmware', it is a broad
>> term used for VBIOS ROM components several of which execute before the driver
>> loads. Perhaps that part of comment can be 'the GPU firmware (gfw) code'.
>
> Yes, we should absolutely explain acronyms as well as use consistent and defined
> terminology when referring to things.
>
> I think we should put both into Documentation/gpu/nova/ and add the
> corresponding pointers in the code.

SGTM.

>
>> I find this Rust convention for camel casing long constants very unreadable and
>> troubling: Pgc6AonSecureScratchGroup05. I think we should relax this requirement
>> for sake of readability. Could the Rust community / maintainers provide some input?
>> 
>> Apart from readability, it also makes searching for the same register name a
>> nightmare with other code bases written in C.
>> 
>> Couple of ideas discussed:
>> 
>> 1. May be have a macro that converts
>> REG(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK) ->
>> regs::Pgc6AonSecureScratchGroup05 ?
>> But not sure what it takes on the rust side to implement a macro like that.
>> 
>> 2. Adding doc comments both in regs.rs during defining the register, and
>> possibly at the caller site. This still does address the issue fully.
>
> If that addresses your concern, it sounds totally reasonable to me.

Sorry, I'm having trouble understanding what you guys are agreeing on.
:)

The most radical option would be to define the registers directly as
capital snake-case (NV_PGC6_...), basically a 1:1 match with OpenRM.
This would be the easiest way to cross-reference, but goes against the
Rust naming conventions. If we go all the way, this also means the field
accessors would be capital snake-case, unless we figure out a smart
macro to work this around...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ