Message-ID: <20251209160515.6658881a@inno-ThinkPad-X280>
Date: Tue, 9 Dec 2025 16:05:15 +0200
From: Zhi Wang <zhiw@...dia.com>
To: Joel Fernandes <joelagnelf@...dia.com>
CC: <rust-for-linux@...r.kernel.org>, <linux-pci@...r.kernel.org>,
<nouveau@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>,
<airlied@...il.com>, <dakr@...nel.org>, <aliceryhl@...gle.com>,
<bhelgaas@...gle.com>, <kwilczynski@...nel.org>, <ojeda@...nel.org>,
<alex.gaynor@...il.com>, <boqun.feng@...il.com>, <gary@...yguo.net>,
<bjorn3_gh@...tonmail.com>, <lossin@...nel.org>, <a.hindborg@...nel.org>,
<tmgross@...ch.edu>, <markus.probst@...teo.de>, <helgaas@...nel.org>,
<cjia@...dia.com>, <alex@...zbot.org>, <smitra@...dia.com>,
<ankita@...dia.com>, <aniketa@...dia.com>, <kwankhede@...dia.com>,
<targupta@...dia.com>, <acourbot@...dia.com>, <jhubbard@...dia.com>,
<zhiwang@...nel.org>
Subject: Re: [RFC 7/7] gpu: nova-core: load the scrubber ucode when vGPU
support is enabled
On Sat, 6 Dec 2025 21:26:12 -0500
Joel Fernandes <joelagnelf@...dia.com> wrote:
> Hi Zhi,
>
> On 12/6/2025 7:42 AM, Zhi Wang wrote:
snip
>
> boot() already returns -ETIMEDOUT via
> wait_till_halted()->read_poll_timeout().
>
> The wait there is 2 seconds. I assume the scrubber would have
> completed by then.
>
> > +
> > + dev_dbg!(
> > + pdev.as_ref(),
> > + "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
> > + mbox0,
> > + mbox1
> > + );
> > +
> > + if !regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed() {
> > + return Err(ETIMEDOUT);
>
> So under which situation do you get to this point
> (!scrubber_completed) ? Basically I am not sure if ETIMEDOUT is the
> right error to return here, because boot() already returns ETIMEDOUT
> by waiting for the halt.
>
> If you still want to return ETIMEDOUT here, then it sounds like you're
> waiting for scrubbing beyond the waiting already done by boot(). If
> so, then shouldn't you need to use read_poll_timeout() here?
>
> perhaps something like:
>
> read_poll_timeout(
>     || Ok(regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed()),
>     |val: &bool| *val,
>     Delta::from_millis(10),
>     Delta::from_secs(5),
> )?;
>
This is identical to the OpenRM implementation [1]. Judging from that
code, I think the scrubber runs as part of the binary's boot process, so
by the time it signals that the firmware booted successfully, the
scrubbing should already be done. Let me change this to another errno.
[1] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/a5bfb10e75a4046c5d991c65f49b5d29151e68cf/src/nvidia/src/kernel/gpu/gsp/arch/ada/kernel_gsp_ad102.c#L49
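
To make the distinction discussed above concrete, here is a rough
userspace sketch (plain std Rust, not kernel code) of the two options:
polling the completion flag with a timeout versus a single check after
boot has already waited for the halt. The names BootError, poll_until
and check_scrubber are hypothetical and exist only for this
illustration; the kernel helper is read_poll_timeout(), and which errno
the one-shot check should return is still open.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error type for this sketch; the driver uses errno values.
#[derive(Debug, PartialEq)]
enum BootError {
    /// The poll loop ran out of time (analogous to ETIMEDOUT).
    TimedOut,
    /// Boot finished, but the completion flag was not set.
    ScrubberNotDone,
}

/// Re-read `op` every `interval` until `cond` accepts the value or
/// `timeout` elapses. Only mimics the shape of read_poll_timeout().
fn poll_until<T>(
    mut op: impl FnMut() -> T,
    cond: impl Fn(&T) -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<T, BootError> {
    let deadline = Instant::now() + timeout;
    loop {
        let val = op();
        if cond(&val) {
            return Ok(val);
        }
        if Instant::now() >= deadline {
            return Err(BootError::TimedOut);
        }
        sleep(interval);
    }
}

/// One-shot check: no extra waiting, so a failure here is not a timeout.
fn check_scrubber(completed: bool) -> Result<(), BootError> {
    if completed {
        Ok(())
    } else {
        Err(BootError::ScrubberNotDone)
    }
}
```

If the scrubber is expected to be finished once boot() returns, the
one-shot form matches that expectation, and a distinct error makes it
clear that no additional waiting took place.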
> Thanks.
>