Message-ID: <20250417082156.5eac67e8@kernel.org>
Date: Thu, 17 Apr 2025 08:21:56 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Shannon Nelson <shannon.nelson@....com>
Cc: <andrew+netdev@...n.ch>, <brett.creeley@....com>, <davem@...emloft.net>,
<edumazet@...gle.com>, <pabeni@...hat.com>,
<michal.swiatkowski@...ux.intel.com>, <horms@...nel.org>,
<linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v3 net 4/4] pds_core: make wait_context part of q_info
On Tue, 15 Apr 2025 16:29:31 -0700 Shannon Nelson wrote:
> Make the wait_context a full part of the q_info struct rather
> than a stack variable that goes away after pdsc_adminq_post()
> is done so that the context is still available after the wait
> loop has given up.
>
> There was a case where a slow development firmware caused
> the adminq request to time out, but then later the FW finally
> finished the request and sent the interrupt. The handler tried
> to complete_all() the completion context that had been created
> on the stack in pdsc_adminq_post() but no longer existed.
> This caused bad pointer usage, kernel crashes, and much wailing
> and gnashing of teeth.
The patch will certainly redirect the access from the stack.
But since you're already processing the completions under a spin
lock, is it not possible to safely invalidate the completion
under the same lock on timeout?
Perhaps not; I haven't looked very closely.
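(For illustration only, a userspace sketch of the invalidate-under-lock idea being suggested here. This is not the driver's actual code; the names `wait_ctx`, `complete_request`, and `finish_wait` are hypothetical, and a pthread mutex plus flags stand in for the adminq spinlock and `struct completion`. The point is that the timeout path clears a `valid` flag under the same lock the completer takes, so a late completion sees the flag and leaves the abandoned context alone.)

```c
#include <pthread.h>
#include <stdbool.h>

struct wait_ctx {
	pthread_mutex_t lock;   /* stands in for the adminq spinlock */
	bool completed;         /* stands in for struct completion */
	bool valid;             /* cleared by the waiter on timeout */
};

/* Completer (IRQ-handler) side: only touch a context that is still valid. */
static void complete_request(struct wait_ctx *wc)
{
	pthread_mutex_lock(&wc->lock);
	if (wc->valid)
		wc->completed = true;   /* else: waiter gave up, skip stale ctx */
	pthread_mutex_unlock(&wc->lock);
}

/* Waiter side, after the wait loop: on timeout, invalidate under the lock. */
static bool finish_wait(struct wait_ctx *wc, bool timed_out)
{
	bool done;

	pthread_mutex_lock(&wc->lock);
	if (!wc->completed && timed_out)
		wc->valid = false;      /* a late IRQ must not complete this */
	done = wc->completed;
	pthread_mutex_unlock(&wc->lock);
	return done;
}
```

With this shape, whether the context lives on the stack or in q_info matters less, because the completer can never reach a context the waiter has already abandoned.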
> + wc = &pdsc->adminqcq.q.info[index].wc;
> + wc->wait_completion = COMPLETION_INITIALIZER_ONSTACK(wc->wait_completion);
_ONSTACK you say? I don't think it's on the stack any more.