Message-ID: <8f95150d-db0d-d9e5-4eff-2196d5e8de05@gmail.com>
Date: Sun, 12 Feb 2023 13:50:29 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Tariq Toukan <tariqt@...dia.com>
Cc: David Chen <david.chen@...anix.com>,
Zhang Qiao <zhangqiao22@...wei.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Saeed Mahameed <saeedm@...dia.com>,
Network Development <netdev@...r.kernel.org>,
Gal Pressman <gal@...dia.com>, Malek Imam <mimam@...dia.com>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Talat Batheesh <talatb@...dia.com>
Subject: Re: Bug report: UDP ~20% degradation

On 08/02/2023 16:12, Vincent Guittot wrote:
> Hi Tariq,
>
> On Wed, 8 Feb 2023 at 12:09, Tariq Toukan <tariqt@...dia.com> wrote:
>>
>> Hi all,
>>
>> Our performance verification team spotted a degradation of up to ~20% in
>> UDP performance, for a specific combination of parameters.
>>
>> Our matrix covers several parameter values, such as:
>> IP version: 4/6
>> MTU: 1500/9000
>> Msg size: 64/1452/8952 (only when applicable while avoiding IP
>> fragmentation).
>> Num of streams: 1/8/16/24.
>> Num of directions: unidir/bidir.
>>
>> Surprisingly, the issue exists only with this specific combination:
>> 8 streams,
>> MTU 9000,
>> Msg size 8952,
>> both ipv4/6,
>> bidir.
>> (in unidir it repros only with ipv4)
>>
>> The reproduction is consistent on all the different setups we tested with.
>>
>> Bisect [2] was done between these two points, v5.19 (Good), and v6.0-rc1
>> (Bad), with ConnectX-6DX NIC.
>>
>> c82a69629c53eda5233f13fc11c3c01585ef48a2 is the first bad commit [1].
>>
>> We couldn't come up with a good explanation of how this patch causes
>> this issue. We also looked for related changes in the networking/UDP
>> stack, but nothing looked suspicious.
>>
>> Maybe someone here can help with this.
>> We can provide more details or run further tests/experiments to make
>> progress with the debugging.
>
> Could you share more details about your system and the CPU topology?
>
Output of 'lscpu':
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
BIOS Vendor ID: QEMU
Model name: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
BIOS Model name: pc-q35-5.0
CPU family: 6
Model: 106
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 24
Stepping: 6
BogoMIPS: 4589.21
Flags: fpu vme de pse tsc msr pae mce cx8 apic
sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology
cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd
ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid
ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f
avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni
avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat
avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni
avx512_bitalg avx512_vpopcntdq rdpid md_clear arch_capabilities
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 768 KiB (24 instances)
L1i cache: 768 KiB (24 instances)
L2 cache: 96 MiB (24 instances)
L3 cache: 384 MiB (24 instances)
NUMA node(s): 1
NUMA node0 CPU(s): 0-23
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Vulnerable: eIBRS with unprivileged eBPF
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

> Commit c82a69629c53 migrates a task to an idle CPU when the task
> is the only one running on the local CPU but the time that CPU spends
> in interrupt or RT context becomes significant (10%-17%).
> I can imagine that 16/24 streams overload your system, so load_balance
> doesn't end up in this case and the CPUs are busy with several
> threads. On the other hand, 1 stream is small enough to keep your
> system lightly loaded, but 8 streams load your system enough
> to trigger the reduced capacity case while still not overloading it.
>
I see. Makes sense.
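
To check that I read the condition correctly, here is a rough Python sketch of it. This is only my simplified model, not the scheduler code: the function names, the capacity scale of 1024 and the imbalance percentage of 117 are assumptions on my side, picked to land in the 10%-17% range you mention.

```python
# Simplified model of the "reduced capacity" case described above.
# All names and default values are assumptions for illustration only.

FULL_CAPACITY = 1024  # assumed capacity scale of a CPU with no IRQ/RT load


def remaining_capacity(irq_rt_fraction: float,
                       capacity_orig: int = FULL_CAPACITY) -> int:
    """Capacity left for normal tasks after interrupt/RT time is taken out."""
    return int(capacity_orig * (1.0 - irq_rt_fraction))


def has_reduced_capacity(capacity: int,
                         capacity_orig: int = FULL_CAPACITY,
                         imbalance_pct: int = 117) -> bool:
    """True when capacity dropped by more than the imbalance margin."""
    return capacity * imbalance_pct < capacity_orig * 100


def may_migrate_lone_task(nr_running: int,
                          irq_rt_fraction: float,
                          idle_cpu_available: bool,
                          imbalance_pct: int = 117) -> bool:
    """A lone task is moved only if its CPU lost enough capacity to
    interrupt/RT work and an idle CPU exists to take the task."""
    capacity = remaining_capacity(irq_rt_fraction)
    return (nr_running == 1
            and idle_cpu_available
            and has_reduced_capacity(capacity, imbalance_pct=imbalance_pct))


if __name__ == "__main__":
    # Sweep the share of time spent in interrupt/RT context.
    for fraction in (0.05, 0.10, 0.15, 0.20):
        print(fraction, may_migrate_lone_task(1, fraction, True))
```

If this model is roughly right, a CPU running several threads (16/24 streams) or a CPU with little interrupt time (1 stream) never satisfies the condition, which would match what we observe.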
1. How can we check this theory? Are there any tests or experiments you would suggest?
2. How do you suggest fixing this degradation?
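
For question 1, one experiment we could run on our side is to sample /proc/stat during the 8-stream run and see how much time each CPU spends in hardirq and softirq context. Below is a rough sketch. The helper names are made up, the 5 second window is arbitrary, and the precision of the irq/softirq columns depends on how the kernel accounts IRQ time, so the numbers would only be an indication.

```python
import time


def parse_cpu_times(stat_text: str) -> dict:
    """Map 'cpuN' -> list of tick counters from /proc/stat-style text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines: "cpuN user nice system idle iowait irq softirq steal ..."
        # The aggregate "cpu" line is skipped on purpose.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            times[fields[0]] = [int(v) for v in fields[1:]]
    return times


def irq_share(before: dict, after: dict) -> dict:
    """Fraction of elapsed ticks each CPU spent in irq + softirq."""
    shares = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        delta = [n - o for n, o in zip(new, old)]
        # Sum only the first 8 columns: guest time is already part of user/nice.
        total = sum(delta[:8])
        irq = delta[5] + delta[6] if len(delta) > 6 else 0
        shares[cpu] = irq / total if total > 0 else 0.0
    return shares


def read_stat(path: str = "/proc/stat") -> str:
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    first = parse_cpu_times(read_stat())
    time.sleep(5)  # sampling window, arbitrary
    second = parse_cpu_times(read_stat())
    shares = irq_share(first, second)
    for cpu in sorted(shares, key=lambda name: int(name[3:])):
        print(f"{cpu}: {shares[cpu]:.1%} in irq/softirq")
```

If the CPUs running the netperf/iperf threads show an irq/softirq share above roughly 10% with 8 streams, that would be consistent with the reduced capacity explanation.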
Thanks,
Tariq