Message-ID: <a4e26da4-cb09-7537-60ff-fd00ec4c49d6@gmail.com>
Date: Thu, 15 Jun 2023 01:36:17 +0100
From: Edward Cree <ecree.xilinx@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, Martin Habets <habetsm.xilinx@...il.com>
Cc: Íñigo Huguet <ihuguet@...hat.com>,
 davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
 netdev@...r.kernel.org, linux-net-drivers@....com, Fei Liu <feliu@...hat.com>
Subject: Re: [PATCH net] sfc: use budget for TX completions

On 14/06/2023 18:27, Jakub Kicinski wrote:
> The documentation is pretty recent. I haven't seen this lockup once 
> in production or testing. Do multiple queues complete on the same CPU
> for SFC or something weird like that?

I think the key question here is whether one CPU can be using a TXQ
 to send while another CPU is in a NAPI poll on the same channel, and
 thus trying to clean the EVQ that the TXQ is using.  If so, the NAPI
 poll could last forever; if not, it should never have more than 8k
 (or whatever the TX ring size is set to) events to process.
And even ignoring affinity of the core TXQs, at the very least XDP
 TXQs can serve CPUs other than the one on which their EVQ (and
 hence NAPI poll) lives, which means they can keep filling the EVQ
 as fast as the NAPI poll empties it, and thus keep ev_process
 looping forever.
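The failure mode, very roughly (invented helper names, not the
 actual driver loop):

	/* Sketch only; read_event()/handle_event() are stand-ins for
	 * whatever the EVQ drain actually does.  If another CPU can
	 * refill the EVQ at least as fast as this empties it, the
	 * loop never terminates and the NAPI poll pins this CPU. */
	while (read_event(channel, &ev))
		handle_event(channel, &ev);
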
In principle this can also happen with other kinds of events, e.g.
 if the MC goes crazy and generates infinite MCDI-event spam, then
 the NAPI poll will spin on that CPU forever eating the events.  So
 maybe this limit needs to be broader than just TX events?  A hard
 cap on the number of events (regardless of type) that can be
 consumed in a single efx_ef10_ev_process() invocation, perhaps?
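Something with this shape, say (again with invented names, and the
 cap value would be a placeholder, not a recommendation):

	/* Cap the total events, of any type, consumed per call. */
	int spent = 0;

	while (spent < hard_cap && read_event(channel, &ev)) {
		handle_event(channel, &ev);
		spent++;
	}
	/* If the cap was hit with events still pending, report the
	 * full NAPI budget back so the poll gets rescheduled rather
	 * than spinning here until the EVQ drains. */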

-ed
