Message-ID: <vbfmunui7dm.fsf@mellanox.com>
Date: Mon, 21 Jan 2019 11:24:44 +0000
From: Vlad Buslov <vladbu@...lanox.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: Linux Kernel Network Developers <netdev@...r.kernel.org>,
	Yevgeny Kliteynik <kliteyn@...lanox.com>,
	Yossef Efraim <yossefe@...lanox.com>,
	Maor Gottlieb <maorg@...lanox.com>
Subject: tc filter insertion rate degradation

Hi Eric,

I've been investigating a significant tc filter insertion rate degradation,
and it seems to be caused by your commit 001c96db0181 ("net: align
gnet_stats_basic_cpu struct"). With this commit, the insertion rate drops
from ~65k rules/sec to ~43k rules/sec when inserting 1M rules from a file
in tc batch mode on my machine.

The perf profile of tc indicates that the pcpu allocator now consumes 2x CPU:

1) Before:

Samples: 63K of event 'cycles:ppp', Event count (approx.): 48796480071
  Children      Self  Co  Shared Object      Symbol
+   21.19%     3.38%  tc  [kernel.vmlinux]   [k] pcpu_alloc
+    3.45%     0.25%  tc  [kernel.vmlinux]   [k] pcpu_alloc_area

2) After:

Samples: 92K of event 'cycles:ppp', Event count (approx.): 71446806550
  Children      Self  Co  Shared Object      Symbol
+   44.67%     3.99%  tc  [kernel.vmlinux]   [k] pcpu_alloc
+   19.25%     0.22%  tc  [kernel.vmlinux]   [k] pcpu_alloc_area

It seems that the pcpu allocator has to do much more work to perform an
allocation under the new, stricter alignment requirements. I am not sure
whether this is expected behavior in this case.

Regards,
Vlad
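
The figures quoted in the message can be cross-checked with some quick arithmetic. The snippet below is only a rough sketch: it assumes the two perf runs profiled the same 1M-rule workload, so that their total event counts are directly comparable, and it treats the "Children" percentage as a share of that total. All variable names are illustrative and do not come from the original report.

```python
# Rough cross-check of the numbers reported in the message.
# Assumption: both perf runs cover the same 1M-rule batch insert,
# so total event counts can be compared directly.

RULES = 1_000_000

rate_before = 65_000  # rules/sec, approximate, as reported
rate_after = 43_000   # rules/sec, approximate, as reported

# Wall-clock time implied by the reported insertion rates.
time_before = RULES / rate_before
time_after = RULES / rate_after
rate_drop = (rate_before - rate_after) / rate_before

# Total 'cycles:ppp' event counts from the two perf headers.
events_before = 48_796_480_071
events_after = 71_446_806_550

# "Children" share attributed to pcpu_alloc in each profile.
share_before = 0.2119
share_after = 0.4467

# Estimated cycles spent under pcpu_alloc (share * total events).
pcpu_before = share_before * events_before
pcpu_after = share_after * events_after

total_ratio = events_after / events_before
pcpu_ratio = pcpu_after / pcpu_before

print(f"insert time: {time_before:.1f}s -> {time_after:.1f}s")
print(f"rate drop: {rate_drop:.1%}")
print(f"total cycles ratio: {total_ratio:.2f}x")
print(f"pcpu_alloc cycles ratio: {pcpu_ratio:.2f}x")
```

Under these assumptions, the share of cycles attributed to pcpu_alloc roughly doubles, while the estimated absolute cycle count under pcpu_alloc grows by about a factor of three, because the total event count also rises between the two runs.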