Message-ID: <a4e26da4-cb09-7537-60ff-fd00ec4c49d6@gmail.com>
Date: Thu, 15 Jun 2023 01:36:17 +0100
From: Edward Cree <ecree.xilinx@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, Martin Habets <habetsm.xilinx@...il.com>
Cc: Íñigo Huguet <ihuguet@...hat.com>,
davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
netdev@...r.kernel.org, linux-net-drivers@....com, Fei Liu <feliu@...hat.com>
Subject: Re: [PATCH net] sfc: use budget for TX completions
On 14/06/2023 18:27, Jakub Kicinski wrote:
> The documentation is pretty recent. I haven't seen this lockup once
> in production or testing. Do multiple queues complete on the same CPU
> for SFC or something weird like that?
I think the key question here is whether one CPU can be using a TXQ to
send while another CPU is in a NAPI poll on the same channel, and thus
trying to clean the EVQ that the TXQ is using. If so, the NAPI poll
could last forever; if not, then it shouldn't ever have more than 8k
(or whatever the TX ring size is set to) events to process.
And even ignoring affinity of the core TXQs, at the very least XDP
TXQs can serve different CPUs to the one on which their EVQ (and
hence NAPI poll) lives, which means they can keep filling the EVQ
as fast as the NAPI poll empties it, and thus keep ev_process
looping forever.
In principle this can also happen with other kinds of events, e.g.
if the MC goes crazy and generates infinite MCDI-event spam then
NAPI poll will spin on that CPU forever eating the events. So
maybe this limit needs to be broader than just TX events? A hard
cap on the number of events (regardless of type) that can be
consumed in a single efx_ef10_ev_process() invocation, perhaps?
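To make the hard-cap idea concrete, here is a rough userspace sketch of it.
This is not the sfc driver's code: struct sim_evq, sim_ev_process(),
sim_poll_needs_resched() and SIM_EV_HARD_CAP are all hypothetical names, and
the "refill" field only imitates another CPU (or a misbehaving MC) adding
events as fast as the poll removes them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an event queue; not the driver's real struct. */
struct sim_evq {
	unsigned int pending;	/* events currently waiting in the queue */
	unsigned int refill;	/* events a remote producer adds per event consumed */
};

/* Assumed cap value, chosen only for illustration. */
#define SIM_EV_HARD_CAP 256u

/*
 * Consume events of any type until the queue is empty or 'cap' events
 * have been handled in this invocation. Returns the number consumed.
 */
static unsigned int sim_ev_process(struct sim_evq *evq, unsigned int cap)
{
	unsigned int spent = 0;

	assert(cap > 0);
	while (evq->pending > 0 && spent < cap) {
		evq->pending--;			/* handle one event */
		spent++;
		evq->pending += evq->refill;	/* producer keeps filling */
	}
	return spent;
}

/*
 * Poll-style wrapper: returns true when the cap was hit, meaning the
 * caller should yield and poll again later rather than keep spinning.
 */
static bool sim_poll_needs_resched(struct sim_evq *evq, unsigned int cap)
{
	return sim_ev_process(evq, cap) == cap;
}
```

With refill set to 1 the queue never drains, yet each call still returns
after SIM_EV_HARD_CAP events; with refill set to 0 the call stops as soon as
the queue is empty. The cap counts every event regardless of type, which is
the broader limit suggested above.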
-ed