Message-ID: <0614d2ab-5e0b-4a0b-8347-9bb3339634c0@nvidia.com>
Date: Thu, 18 Dec 2025 20:48:33 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Timur Tabi <ttabi@...dia.com>
Cc: "gary@...yguo.net" <gary@...yguo.net>,
"lossin@...nel.org" <lossin@...nel.org>,
"a.hindborg@...nel.org" <a.hindborg@...nel.org>,
"boqun.feng@...il.com" <boqun.feng@...il.com>,
"ojeda@...nel.org" <ojeda@...nel.org>, "simona@...ll.ch" <simona@...ll.ch>,
"tmgross@...ch.edu" <tmgross@...ch.edu>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"nouveau@...ts.freedesktop.org" <nouveau@...ts.freedesktop.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rust-for-linux@...r.kernel.org" <rust-for-linux@...r.kernel.org>,
"bjorn3_gh@...tonmail.com" <bjorn3_gh@...tonmail.com>,
Eliot Courtney <ecourtney@...dia.com>,
"aliceryhl@...gle.com" <aliceryhl@...gle.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"kwilczynski@...nel.org" <kwilczynski@...nel.org>,
Alexandre Courbot <acourbot@...dia.com>, "dakr@...nel.org"
<dakr@...nel.org>, Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 6/7] gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP
command GSP upon unloading
On 12/18/2025 8:46 PM, Joel Fernandes wrote:
>
>
> On 12/18/2025 6:34 PM, Timur Tabi wrote:
>> On Thu, 2025-12-18 at 22:44 +0000, Joel Fernandes wrote:
>>>> Isn't the real problem that we are polling for a specific message, when all messages should be
>>>> handled asynchronously as events, like Nouveau does?
>>>>
>>>> Err(ERANGE) => continue,
>>>>
>>>> This effectively throws out all other messages, including errors and anything else important.
>>>>
>>>
>>> Indeed, for that we need Interrupts. For the rest of the patterns where we need the message
>>> synchronously, we should bound this. Hanging in the driver is unacceptable.
>>
>> It's going to be difficult to have a running asynchronous message handler in the background *and*
>> poll synchronously for a specific message on occasion. I would say that even in this case, we
>> should handle the message asynchronously. So instead of polling on the message queue, we just wait
>> on a semaphore, with a timeout.
>
> I think we don't strictly need a semaphore for synchronous polling - the wait is
> expected to be short AFAIK and if not we should just error out. What we need is
> a registration mechanism that registers different event types and their
> handlers, and if the message received is not an expected one, we simply call the
> event handler registered while continuing to poll for the message we are
> expecting until it is received: See how Nouveau does it in r535_gsp_msg_recv().
> Anyway, the wait should be expected to be short and if not, we'd break out of
> the loop { }.
>
> Interestingly, Nouveau inserts 2-microsecond sleeps while polling AFAICS, whereas
> OpenRM simply spins without sleeps. I would even say that sleeping in the
> loop is risky because it is not allowed in atomic context, so we'd have to be
> careful there (I am guessing all our use cases for these loops are in non-atomic
> context?).
>
> We still need the interrupt handling for cases where we don't need synchronous
> polling. In those cases, we will directly call the event handlers from the
> IRQ->workqueue path. The event handler registration/calling code in both cases
> should be the same.
>
> So in the loop { }, nova needs something like this:
>
>     Err(ERANGE) => {
>         // Dispatch to notification
>         dispatch_async_message(msg); // Same ones called by Async handling.
>         continue;
>     }
Btw, I can work on this tomorrow and send out some patches.
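
To make the idea above concrete, here is a minimal user-space sketch of a
bounded polling loop that hands unexpected messages to registered handlers
instead of dropping them. It is not nova-core code: `Msg`, `MsgQueue`,
`WaitError`, `register()` and `wait_for()` are hypothetical names invented for
illustration, the queue is a plain in-memory `VecDeque`, and the busy-wait with
a deadline only stands in for whatever bounded wait the driver ends up using.
Only `dispatch_async_message()` takes its name from the snippet above.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical message: a function/event id plus an opaque payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub function: u32,
    pub payload: Vec<u8>,
}

/// Hypothetical error for a synchronous wait that ran out of time.
#[derive(Debug, PartialEq)]
pub enum WaitError {
    Timeout,
}

type Handler = Box<dyn FnMut(&Msg)>;

/// Hypothetical stand-in for the message queue plus a handler registry.
pub struct MsgQueue {
    pending: VecDeque<Msg>,
    handlers: HashMap<u32, Handler>,
}

impl MsgQueue {
    pub fn new() -> Self {
        Self { pending: VecDeque::new(), handlers: HashMap::new() }
    }

    /// Simulates a message arriving on the queue.
    pub fn push(&mut self, msg: Msg) {
        self.pending.push_back(msg);
    }

    /// Registers the handler for one event type.
    pub fn register(&mut self, function: u32, handler: impl FnMut(&Msg) + 'static) {
        self.handlers.insert(function, Box::new(handler));
    }

    /// Shared dispatch path: the synchronous poller below uses it, and an
    /// interrupt-driven consumer could call the same function.
    pub fn dispatch_async_message(&mut self, msg: &Msg) {
        if let Some(handler) = self.handlers.get_mut(&msg.function) {
            handler(msg);
        }
        // Messages without a handler are ignored in this sketch.
    }

    /// Polls until `expected` arrives or `timeout` elapses. Any other message
    /// seen in the meantime is dispatched rather than thrown away.
    pub fn wait_for(&mut self, expected: u32, timeout: Duration) -> Result<Msg, WaitError> {
        let deadline = Instant::now() + timeout;
        loop {
            match self.pending.pop_front() {
                Some(msg) if msg.function == expected => return Ok(msg),
                Some(msg) => self.dispatch_async_message(&msg),
                None => std::hint::spin_loop(),
            }
            // Checked on every iteration so a stream of unrelated messages
            // cannot keep the loop alive past the deadline.
            if Instant::now() >= deadline {
                return Err(WaitError::Timeout);
            }
        }
    }
}
```

The point of the sketch is that the wait is always bounded and that
`dispatch_async_message()` is the only place that knows about handlers, so the
polling path and an interrupt-driven path would share it.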