Date: Sun, 12 Feb 2023 13:50:29 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
    Tariq Toukan <tariqt@...dia.com>
Cc: David Chen <david.chen@...anix.com>,
    Zhang Qiao <zhangqiao22@...wei.com>,
    "Peter Zijlstra (Intel)" <peterz@...radead.org>,
    Willem de Bruijn <willemdebruijn.kernel@...il.com>,
    Ingo Molnar <mingo@...hat.com>,
    Juri Lelli <juri.lelli@...hat.com>,
    Valentin Schneider <vschneid@...hat.com>,
    linux-kernel@...r.kernel.org,
    "David S. Miller" <davem@...emloft.net>,
    Eric Dumazet <edumazet@...gle.com>,
    Jakub Kicinski <kuba@...nel.org>,
    Paolo Abeni <pabeni@...hat.com>,
    Saeed Mahameed <saeedm@...dia.com>,
    Network Development <netdev@...r.kernel.org>,
    Gal Pressman <gal@...dia.com>,
    Malek Imam <mimam@...dia.com>,
    Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
    David Ahern <dsahern@...nel.org>,
    Talat Batheesh <talatb@...dia.com>
Subject: Re: Bug report: UDP ~20% degradation

On 08/02/2023 16:12, Vincent Guittot wrote:
> Hi Tariq,
>
> On Wed, 8 Feb 2023 at 12:09, Tariq Toukan <tariqt@...dia.com> wrote:
>>
>> Hi all,
>>
>> Our performance verification team spotted a degradation of up to ~20% in
>> UDP performance, for a specific combination of parameters.
>>
>> Our matrix covers several parameters values, like:
>> IP version: 4/6
>> MTU: 1500/9000
>> Msg size: 64/1452/8952 (only when applicable while avoiding ip
>> fragmentation).
>> Num of streams: 1/8/16/24.
>> Num of directions: unidir/bidir.
>>
>> Surprisingly, the issue exists only with this specific combination:
>> 8 streams,
>> MTU 9000,
>> Msg size 8952,
>> both ipv4/6,
>> bidir.
>> (in unidir it repros only with ipv4)
>>
>> The reproduction is consistent on all the different setups we tested with.
>>
>> Bisect [2] was done between these two points, v5.19 (Good), and v6.0-rc1
>> (Bad), with ConnectX-6DX NIC.
>>
>> c82a69629c53eda5233f13fc11c3c01585ef48a2 is the first bad commit [1].
>>
>> We couldn't come up with a good explanation how this patch causes this
>> issue. We also looked for related changes in the networking/UDP stack,
>> but nothing looked suspicious.
>>
>> Maybe someone here can help with this.
>> We can provide more details or do further tests/experiments to progress
>> with the debug.
>
> Could you share more details about your system and the cpu topology ?
>

output for 'lscpu':

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   40 bits physical, 57 bits virtual
Byte Order:                      Little Endian
CPU(s):                          24
On-line CPU(s) list:             0-23
Vendor ID:                       GenuineIntel
BIOS Vendor ID:                  QEMU
Model name:                      Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
BIOS Model name:                 pc-q35-5.0
CPU family:                      6
Model:                           106
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       24
Stepping:                        6
BogoMIPS:                        4589.21
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep
                                 mtrr pge mca cmov pat pse36 clflush mmx fxsr
                                 sse sse2 ss syscall nx pdpe1gb rdtscp lm
                                 constant_tsc arch_perfmon rep_good nopl
                                 xtopology cpuid tsc_known_freq pni pclmulqdq
                                 vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2
                                 x2apic movbe popcnt tsc_deadline_timer aes
                                 xsave avx f16c rdrand hypervisor lahf_lm abm
                                 3dnowprefetch cpuid_fault invpcid_single ssbd
                                 ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi
                                 flexpriority ept vpid ept_ad fsgsbase
                                 tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
                                 avx512f avx512dq rdseed adx smap avx512ifma
                                 clflushopt clwb avx512cd sha_ni avx512bw
                                 avx512vl xsaveopt xsavec xgetbv1 xsaves
                                 wbnoinvd arat avx512vbmi umip pku ospke
                                 avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni
                                 avx512_bitalg avx512_vpopcntdq rdpid md_clear
                                 arch_capabilities
Virtualization:                  VT-x
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       768 KiB (24 instances)
L1i cache:                       768 KiB (24 instances)
L2 cache:                        96 MiB (24 instances)
L3 cache:                        384 MiB (24 instances)
NUMA node(s):                    1
NUMA node0 CPU(s):               0-23
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Vulnerable: Clear CPU buffers attempted, no
                                 microcode; SMT Host state unknown
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled
                                 via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and
                                 __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable: eIBRS with unprivileged eBPF
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected

> The commit c82a69629c53 migrates a task on an idle cpu when the task
> is the only one running on local cpu but the time spent by this local
> cpu under interrupt or RT context becomes significant (10%-17%)
> I can imagine that 16/24 stream overload your system so load_balance
> doesn't end up in this case and the cpus are busy with several
> threads. On the other hand, 1 stream is small enough to keep your
> system lightly loaded but 8 streams make your system significantly
> loaded to trigger the reduced capacity case but still not overloaded.
>

I see. Makes sense.
1. How do you check this theory? Any suggested tests/experiments?
2. How do you suggest this degradation should be fixed?

Thanks,
Tariq
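
One possible way to approach question 1, as a minimal sketch: the reduced-capacity theory quoted above predicts that, in the 8-stream case, some CPUs spend more than roughly 10%-17% of their time in interrupt context. The Python below compares two snapshots of /proc/stat and flags such CPUs. The function names, the default imbalance_pct of 117, and the comparison itself (modelled on a scheduler check of the form capacity * imbalance_pct < capacity_orig * 100) are assumptions made for illustration and are not stated in this thread. RT and deadline pressure, which also reduce capacity, are not covered.

```python
def irq_share(before, after):
    """Return {cpu_name: fraction of elapsed time spent in irq + softirq}.

    `before` and `after` are the full text of /proc/stat read at two
    different times. Hypothetical helper, for illustration only.
    """
    def parse(text):
        cpus = {}
        for line in text.splitlines():
            fields = line.split()
            # Per-CPU lines: "cpuN user nice system idle iowait irq softirq steal ..."
            # The aggregate "cpu" line is skipped.
            if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
                cpus[fields[0]] = [int(v) for v in fields[1:]]
        return cpus

    old_stats, new_stats = parse(before), parse(after)
    shares = {}
    for cpu, new in new_stats.items():
        old = old_stats.get(cpu)
        if old is None:
            continue
        delta = [n - o for n, o in zip(new, old)]
        # Sum user..steal only; guest time is already counted in user.
        total = sum(delta[:8])
        if total > 0:
            shares[cpu] = (delta[5] + delta[6]) / total
    return shares


def capacity_looks_reduced(share, imbalance_pct=117):
    """Assumed model: remaining capacity is (1 - share) of the original.

    Mirrors a check of the form capacity * imbalance_pct < capacity_orig * 100.
    The default of 117 is an assumption about the scheduler domain in use.
    """
    return (1.0 - share) * imbalance_pct < 100.0
```

To use it, read /proc/stat once before and once during the 8-stream bidir run (for example with open("/proc/stat").read(), a few seconds apart), pass both strings to irq_share(), and apply capacity_looks_reduced() to each CPU's value. If no CPU is flagged in the failing configuration, that would count against the theory; if several are flagged only in that configuration, it would be consistent with it.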