Date: Wed, 22 May 2024 12:08:51 +0100
From: Edward Cree <ecree.xilinx@...il.com>
To: mengkanglai <mengkanglai2@...wei.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Jiri Pirko <jiri@...nulli.us>, Simon Horman <horms@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Lorenzo Bianconi <lorenzo@...nel.org>,
 "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
 open list <linux-kernel@...r.kernel.org>
Cc: "Fengtao (fengtao, Euler)" <fengtao40@...wei.com>,
 "Yanan (Euler)" <yanan@...wei.com>
Subject: Re: cpu performance drop between 4.18 and 5.10 kernel?

On 22/05/2024 08:44, mengkanglai wrote:
> Dear maintainers:
> I updated my VM kernel from 4.18 to 5.10, and found that the CPU SI usage was higher under the 5.10 kernel for the same udp service.
> I captured the flame graph and compared the two versions of kernels. 
> Compared to 4.18, the napi_complete_done function in kernel 5.10 has an added gro_normal_list call (introduced by commit 323ebb61e32b4 ("net: use listified RX for handling GRO_NORMAL
> skbs")). When I removed gro_normal_list from napi_complete_done in the 5.10 kernel, CPU SI usage was the same as on 4.18.
> I don't know much about GRO, so I'm not sure whether it can be modified in this way, or what the consequences of such a modification would be.

No, you can't just remove that call, else network RX packets will
 be delayed for arbitrarily long times, and potentially leaked if
 the netdev is ifdowned.  The delay may also lead to other bugs
 from code that assumes the RX processing happens within a single
 NAPI cycle.
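
To illustrate why that flush matters, here is a toy Python model of the
 batching.  It is not kernel code: the class name NapiModel and the
 flush_on_complete switch are made up for illustration, and it ignores
 GRO itself and everything else the real functions do.  Packets are
 queued on a per-NAPI list and handed up the stack either when the
 batch fills or when the poll completes.

```python
class NapiModel:
    """Simplified, hypothetical model of listified RX batching."""

    def __init__(self, batch, flush_on_complete=True):
        self.batch = batch                  # stands in for gro_normal_batch
        self.flush_on_complete = flush_on_complete
        self.rx_list = []                   # packets queued, not yet delivered
        self.delivered = []                 # packets handed to the stack

    def gro_normal_one(self, skb):
        # Queue one packet; flush once the batch threshold is reached.
        self.rx_list.append(skb)
        if len(self.rx_list) >= self.batch:
            self.gro_normal_list()

    def gro_normal_list(self):
        # Deliver everything queued so far in one go.
        self.delivered.extend(self.rx_list)
        self.rx_list.clear()

    def napi_complete_done(self):
        # End of the poll cycle: flush any partial batch.
        if self.flush_on_complete:
            self.gro_normal_list()


def run_poll(model, packets):
    """Feed packets through one poll cycle; return (delivered, stuck)."""
    for skb in packets:
        model.gro_normal_one(skb)
    model.napi_complete_done()
    return list(model.delivered), list(model.rx_list)
```

With batch=8 and three packets, run_poll() on a normal model delivers
 all three, whereas with flush_on_complete=False they stay on rx_list
 until some later poll happens to fill the batch, which on a quiet
 interface may be never.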
You could revert the commit, and if that improves performance for
 you then more data would potentially be interesting.
You can also try altering sysctl net.core.gro_normal_batch;
 setting it to 0 (or 1) should prevent any batching and in theory
 give the same performance as reverting 323ebb61e32b4 — if it
 doesn't then that's also a significant datum.
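
For reference, a minimal sketch of reading and setting the sysctl from
 Python, assuming it is exposed at /proc/sys/net/core/gro_normal_batch
 (the usual mapping for net.core.* sysctls).  The helper names and the
 root parameter are hypothetical; root is only there so the functions
 can be tried against a scratch directory.  Writing the real file needs
 root privileges.

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")


def sysctl_path(name, root=PROC_SYS):
    # "net.core.gro_normal_batch" -> <root>/net/core/gro_normal_batch
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root=PROC_SYS):
    # Read an integer-valued sysctl.
    return int(sysctl_path(name, root).read_text().strip())


def set_sysctl(name, value, root=PROC_SYS):
    # Write a new value and return the previous one so it can be restored.
    old = read_sysctl(name, root)
    sysctl_path(name, root).write_text(f"{value}\n")
    return old
```

So set_sysctl("net.core.gro_normal_batch", 1) before a test run, then
 set_sysctl("net.core.gro_normal_batch", old) afterwards with the value
 it returned, would let you compare SI usage with and without batching
 on the same kernel.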

-ed
