linux-kernel - Re: [RFC 7/7] gpu: nova-core: load the scrubber ucode when vGPU support is enabled

Message-ID: <E8245EE2-887A-447A-8576-DC845FD57DC1@nvidia.com>
Date: Thu, 11 Dec 2025 01:24:49 +0000
From: Joel Fernandes <joelagnelf@...dia.com>
To: Zhi Wang <zhiw@...dia.com>
CC: "rust-for-linux@...r.kernel.org" <rust-for-linux@...r.kernel.org>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"nouveau@...ts.freedesktop.org" <nouveau@...ts.freedesktop.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"airlied@...il.com" <airlied@...il.com>, "dakr@...nel.org" <dakr@...nel.org>,
	"aliceryhl@...gle.com" <aliceryhl@...gle.com>, "bhelgaas@...gle.com"
	<bhelgaas@...gle.com>, "kwilczynski@...nel.org" <kwilczynski@...nel.org>,
	"ojeda@...nel.org" <ojeda@...nel.org>, "alex.gaynor@...il.com"
	<alex.gaynor@...il.com>, "boqun.feng@...il.com" <boqun.feng@...il.com>,
	"gary@...yguo.net" <gary@...yguo.net>, "bjorn3_gh@...tonmail.com"
	<bjorn3_gh@...tonmail.com>, "lossin@...nel.org" <lossin@...nel.org>,
	"a.hindborg@...nel.org" <a.hindborg@...nel.org>, "tmgross@...ch.edu"
	<tmgross@...ch.edu>, "markus.probst@...teo.de" <markus.probst@...teo.de>,
	"helgaas@...nel.org" <helgaas@...nel.org>, Neo Jia <cjia@...dia.com>,
	"alex@...zbot.org" <alex@...zbot.org>, Surath Mitra <smitra@...dia.com>,
	Ankit Agrawal <ankita@...dia.com>, Aniket Agashe <aniketa@...dia.com>, Kirti
 Wankhede <kwankhede@...dia.com>, "Tarun Gupta (SW-GPU)"
	<targupta@...dia.com>, Alexandre Courbot <acourbot@...dia.com>, John Hubbard
	<jhubbard@...dia.com>, "zhiwang@...nel.org" <zhiwang@...nel.org>
Subject: Re: [RFC 7/7] gpu: nova-core: load the scrubber ucode when vGPU
 support is enabled


> On Dec 9, 2025, at 11:05 PM, Zhi Wang <zhiw@...dia.com> wrote:
> [..]
>>> +
>>> +            dev_dbg!(
>>> +                pdev.as_ref(),
>>> +                "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n",
>>> +                mbox0,
>>> +                mbox1
>>> +            );
>>> +
>>> +            if !regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed() {
>>> +                return Err(ETIMEDOUT);
>> 
>> So under which situation do you get to this point
>> (!scrubber_completed) ? Basically I am not sure if ETIMEDOUT is the
>> right error to return here, because boot() already returns ETIMEDOUT
>> by waiting for the halt.
>> 
>> If you still want return ETIMEDOUT here, then it sounds like you're
>> waiting for scrubbing beyond the waiting already done by boot(). If
>> so, then shouldn't you need to use read_poll_timeout() here?
>> 
>> perhaps something like:
>> 
>> read_poll_timeout(
>>     || Ok(regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed()),
>>     |val: &bool| *val,
>>     Delta::from_millis(10),
>>     Delta::from_secs(5),
>> )?;
>> 
> 
> This is identical to the implementation in OpenRM [1]. According to
> that part of the code, I think the scrubber runs during the binary boot
> process. When it signals that the firmware booted successfully, the
> scrubbing should be done. Let me change to another errno.
> 
> [1]https://github.com/NVIDIA/open-gpu-kernel-modules/blob/a5bfb10e75a4046c5d991c65f49b5d29151e68cf/src/nvidia/src/kernel/gpu/gsp/arch/ada/kernel_gsp_ad102.c#L49

Sure. The patch was misleading because it returned a timeout error when the actual failure is something else (for example, the scrubber failing). Thanks for correcting it.

 - Joel
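
The distinction discussed above, between a timeout from actually waiting and a one-shot check that finds the scrubber not completed, can be sketched in standalone Rust. This is only an illustration under stated assumptions: `ScrubError`, `poll_until`, and `check_once` are hypothetical names invented for the sketch, it uses `std` rather than the kernel crate, and it is not the nova-core or `read_poll_timeout()` implementation.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error type for this sketch; kernel code would use errno values.
#[derive(Debug, PartialEq, Eq)]
pub enum ScrubError {
    /// The condition was polled for the full timeout and never became true.
    TimedOut,
    /// Boot already finished waiting, but the completion flag is not set.
    NotCompleted,
}

/// Poll `read` every `interval` until `done` accepts the value,
/// or return `TimedOut` once `timeout` has elapsed.
pub fn poll_until<T>(
    mut read: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<T, ScrubError> {
    let deadline = Instant::now() + timeout;
    loop {
        let val = read();
        if done(&val) {
            return Ok(val);
        }
        if Instant::now() >= deadline {
            return Err(ScrubError::TimedOut);
        }
        sleep(interval);
    }
}

/// One-shot check with no waiting: a failure here is not a timeout,
/// so it is reported with a different error.
pub fn check_once(completed: bool) -> Result<(), ScrubError> {
    if completed {
        Ok(())
    } else {
        Err(ScrubError::NotCompleted)
    }
}
```

If the completion flag is only read once after boot has already waited for the halt, `check_once` matches what the code does and a non-timeout error describes the failure; `poll_until` would only be appropriate if additional waiting were intended.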