linux-kernel - Re: [PATCH 6/7] gpu: nova-core: send UNLOADING_GUEST


Message-ID: <0614d2ab-5e0b-4a0b-8347-9bb3339634c0@nvidia.com>
Date: Thu, 18 Dec 2025 20:48:33 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Timur Tabi <ttabi@...dia.com>
Cc: "gary@...yguo.net" <gary@...yguo.net>,
 "lossin@...nel.org" <lossin@...nel.org>,
 "a.hindborg@...nel.org" <a.hindborg@...nel.org>,
 "boqun.feng@...il.com" <boqun.feng@...il.com>,
 "ojeda@...nel.org" <ojeda@...nel.org>, "simona@...ll.ch" <simona@...ll.ch>,
 "tmgross@...ch.edu" <tmgross@...ch.edu>,
 "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
 "nouveau@...ts.freedesktop.org" <nouveau@...ts.freedesktop.org>,
 "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "rust-for-linux@...r.kernel.org" <rust-for-linux@...r.kernel.org>,
 "bjorn3_gh@...tonmail.com" <bjorn3_gh@...tonmail.com>,
 Eliot Courtney <ecourtney@...dia.com>,
 "aliceryhl@...gle.com" <aliceryhl@...gle.com>,
 "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
 "kwilczynski@...nel.org" <kwilczynski@...nel.org>,
 Alexandre Courbot <acourbot@...dia.com>, "dakr@...nel.org"
 <dakr@...nel.org>, Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 6/7] gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP
 command GSP upon unloading



On 12/18/2025 8:46 PM, Joel Fernandes wrote:
> 
> 
> On 12/18/2025 6:34 PM, Timur Tabi wrote:
>> On Thu, 2025-12-18 at 22:44 +0000, Joel Fernandes wrote:
>>>> Isn't the real problem that we are polling for a specific message, when all messages should be
>>>> handled asynchronously as events, like Nouveau does?
>>>>
>>>>           Err(ERANGE) => continue,
>>>>
>>>> This effectively throws out all other messages, including errors and anything else important.
>>>>
>>>
>>> Indeed, for that we need interrupts. For the rest of the patterns where we need the message
>>> synchronously, we should bound the wait. Hanging in the driver is unacceptable.
>>
>> It's going to be difficult to have a running asynchronous message handler in the background *and*
>> poll synchronously for a specific message on occasion.  I would say that even in this case, we
>> should handle the message asynchronously.  So instead of polling on the message queue, we just wait
>> on a semaphore, with a timeout.
> 
> I think we don't strictly need a semaphore for synchronous polling - the wait is
> expected to be short AFAIK and if not we should just error out. What we need is
> a registration mechanism that registers different event types and their
> handlers. If the message received is not the expected one, we simply call the
> registered event handler and continue to poll for the message we are
> expecting until it is received. See how Nouveau does it in r535_gsp_msg_recv().
> Anyway, the wait should be expected to be short and if not, we'd break out of
> the loop { }.
> 
> Interestingly, Nouveau inserts 2-microsecond sleeps while polling AFAICS, whereas
> OpenRM simply spins without sleeps. I would even say that sleeping in the
> loop is risky because sleeping is not allowed in atomic context, so we'd have to
> be careful there (I am guessing all our use cases for these loops are in
> non-atomic context?).
> 
> We still need the interrupt handling for cases where we don't need synchronous
> polling. In those cases, we will directly call the event handlers from the
> IRQ->workqueue path. The event handler registration/calling code in both cases
> should be the same.
> 
> So in the loop { }, nova needs something like this:
> 
>   Err(ERANGE) => {
>       // Dispatch to notification
>       dispatch_async_message(msg); // Same ones called by Async handling.
>       continue;
>   }
Btw, I can work on this tomorrow and send out some patches.
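
For illustration, the pattern described above (registered event handlers plus a bounded synchronous poll that dispatches unexpected messages instead of dropping them) could be sketched roughly as follows. This is a userspace sketch using only the Rust standard library, not nova-core code: `Msg`, `RecvError`, `Dispatcher`, and `wait_for` are hypothetical names, the `VecDeque` stands in for the GSP message queue, and the real driver would use kernel APIs rather than `std`.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical message: a function/event id plus an opaque payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub function: u32,
    pub payload: Vec<u8>,
}

/// Hypothetical error type for the bounded wait.
#[derive(Debug, PartialEq)]
pub enum RecvError {
    Timeout,
}

type Handler = Box<dyn FnMut(&Msg)>;

/// Registry mapping event types to handlers. The same registry could be
/// driven from an interrupt/workqueue path and from the polling loop.
pub struct Dispatcher {
    handlers: HashMap<u32, Handler>,
}

impl Dispatcher {
    pub fn new() -> Self {
        Dispatcher { handlers: HashMap::new() }
    }

    /// Register (or replace) the handler for one event type.
    pub fn register(&mut self, function: u32, handler: impl FnMut(&Msg) + 'static) {
        self.handlers.insert(function, Box::new(handler));
    }

    /// Call the handler registered for `msg`, if any.
    /// Returns false when no handler exists (a real driver would likely log this).
    pub fn dispatch(&mut self, msg: &Msg) -> bool {
        match self.handlers.get_mut(&msg.function) {
            Some(handler) => {
                handler(msg);
                true
            }
            None => false,
        }
    }
}

/// Poll `queue` for a message of type `expected`, dispatching every other
/// message to its registered handler. The loop is bounded by `timeout`, so
/// it cannot hang forever. It spins without sleeping.
pub fn wait_for(
    queue: &mut VecDeque<Msg>,
    dispatcher: &mut Dispatcher,
    expected: u32,
    timeout: Duration,
) -> Result<Msg, RecvError> {
    let deadline = Instant::now() + timeout;
    loop {
        match queue.pop_front() {
            // The message we are waiting for: hand it back to the caller.
            Some(msg) if msg.function == expected => return Ok(msg),
            // Anything else goes to the shared handlers instead of being dropped.
            Some(msg) => {
                dispatcher.dispatch(&msg);
            }
            // Queue empty: spin and re-check.
            None => std::hint::spin_loop(),
        }
        // Checked on every iteration so a flood of unrelated messages
        // cannot keep the loop alive past the deadline.
        if Instant::now() >= deadline {
            return Err(RecvError::Timeout);
        }
    }
}
```

A caller would register its notification handlers once, then call `wait_for(&mut queue, &mut dispatcher, expected_id, timeout)` wherever a synchronous reply is needed and treat `Err(RecvError::Timeout)` as a hard error. Whether to spin or sleep between polls is left open here, as it is in the discussion above.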