Open Source and information security mailing list archives
Message-ID: <113e81f6-b349-97c0-4cec-d90087e7e13b@nvidia.com>
Date:   Wed, 8 Feb 2023 13:08:54 +0200
From:   Tariq Toukan <tariqt@...dia.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>,
        David Chen <david.chen@...anix.com>,
        Zhang Qiao <zhangqiao22@...wei.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Tariq Toukan <tariqt@...dia.com>,
        Network Development <netdev@...r.kernel.org>,
        Gal Pressman <gal@...dia.com>, Malek Imam <mimam@...dia.com>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>,
        Tariq Toukan <ttoukan.linux@...il.com>
Subject: Bug report: UDP ~20% degradation

Hi all,

Our performance verification team spotted a degradation of up to ~20% in 
UDP performance, for a specific combination of parameters.

Our test matrix covers several parameter values, such as:
IP version: 4/6
MTU: 1500/9000
Msg size: 64/1452/8952 (only where applicable, so as to avoid IP
fragmentation)
Num of streams: 1/8/16/24
Num of directions: unidir/bidir
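
For illustration only, here is a minimal Python sketch of how such a
matrix could be enumerated. It is not the actual test harness. It assumes
that "applicable" means the message fits in one packet, i.e. that
msg size <= MTU - IP header - UDP header, using the minimum header sizes
(20 bytes for IPv4, 40 for IPv6, 8 for UDP). The function names are
hypothetical.

```python
from itertools import product

# Minimum header sizes in bytes (no IPv4 options, no IPv6 extension headers).
IP_HEADER_BYTES = {4: 20, 6: 40}
UDP_HEADER_BYTES = 8


def max_udp_payload(mtu, ip_version):
    """Largest UDP payload that fits in one packet without IP fragmentation."""
    return mtu - IP_HEADER_BYTES[ip_version] - UDP_HEADER_BYTES


def test_matrix():
    """Enumerate (ip_version, mtu, msg_size, streams, direction) tuples,
    keeping only those whose message size fits the MTU (an assumption)."""
    ip_versions = (4, 6)
    mtus = (1500, 9000)
    msg_sizes = (64, 1452, 8952)
    streams = (1, 8, 16, 24)
    directions = ("unidir", "bidir")
    return [
        combo
        for combo in product(ip_versions, mtus, msg_sizes, streams, directions)
        if combo[2] <= max_udp_payload(combo[1], combo[0])
    ]


matrix = test_matrix()
print(len(matrix), "combinations")
```

Under these assumptions, 1452 and 8952 are exactly the IPv6 payload
limits for MTU 1500 and MTU 9000, so the same sizes also fit within the
IPv4 limits.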

Surprisingly, the issue exists only with this specific combination:
8 streams,
MTU 9000,
Msg size 8952,
both IPv4 and IPv6,
bidir.
(In the unidir case it reproduces only with IPv4.)

The issue reproduces consistently on all the setups we tested.

We bisected [2] between v5.19 (Good) and v6.0-rc1 (Bad), using a
ConnectX-6DX NIC.

c82a69629c53eda5233f13fc11c3c01585ef48a2 is the first bad commit [1].
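
For anyone who wants to repeat the bisect, the Good/Bad decision at each
step could be automated along these lines. This is a sketch only: the
268 Gbps threshold is an assumption picked to sit between the Good and
Bad bandwidth ranges in table [2], and bisect_exit_code is a hypothetical
helper, not something we actually used. The exit codes follow the
`git bisect run` convention (0 = good, 125 = skip, 1 = bad).

```python
from statistics import mean

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(samples_gbps, threshold_gbps=268.0):
    """Map measured bandwidth samples to a `git bisect run` exit code.

    An empty sample list (e.g. the kernel failed to build or boot) is
    reported as SKIP so that bisect tries a neighbouring commit.
    The default threshold is an assumed value, not a measured one.
    """
    if not samples_gbps:
        return SKIP
    return GOOD if mean(samples_gbps) >= threshold_gbps else BAD


# Two rows from table [2], as a sanity check of the threshold.
print(bisect_exit_code([254.7, 246.1]))  # c82a69629c53
print(bisect_exit_code([276.4, 294.0]))  # c02d5546ea34
```

A wrapper script would still have to build and boot each kernel, run the
UDP test and pass the result to sys.exit(); that part is setup-specific
and left out here.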

We couldn't come up with a good explanation of how this patch causes
this issue. We also looked for related changes in the networking/UDP
stack, but nothing looked suspicious.

Maybe someone here can help with this.
We can provide more details or run further tests/experiments to help
with debugging.

Thanks,
Tariq

[1]
commit c82a69629c53eda5233f13fc11c3c01585ef48a2
Author: Vincent Guittot <vincent.guittot@...aro.org>
Date:   Fri Jul 8 17:44:01 2022 +0200

     sched/fair: fix case with reduced capacity CPU

     The capacity of the CPU available for CFS tasks can be reduced because of
     other activities running on the latter. In such case, it's worth trying to
     move CFS tasks on a CPU with more available capacity.

     The rework of the load balance has filtered the case when the CPU is
     classified to be fully busy but its capacity is reduced.

     Check if CPU's capacity is reduced while gathering load balance statistic
     and classify it group_misfit_task instead of group_fully_busy so we can
     try to move the load on another CPU.

     Reported-by: David Chen <david.chen@...anix.com>
     Reported-by: Zhang Qiao <zhangqiao22@...wei.com>
     Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
     Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
     Tested-by: David Chen <david.chen@...anix.com>
     Tested-by: Zhang Qiao <zhangqiao22@...wei.com>
     Link: https://lkml.kernel.org/r/20220708154401.21411-1-vincent.guittot@linaro.org

[2]

Detailed bisect steps:

+--------------+--------+-----------+-----------+
| Commit       | Status | BW (Gbps) | BW (Gbps) |
|              |        | run1      | run2      |
+--------------+--------+-----------+-----------+
| 526942b8134c | Bad    | ---       | ---       |
+--------------+--------+-----------+-----------+
| 2e7a95156d64 | Bad    | ---       | ---       |
+--------------+--------+-----------+-----------+
| 26c350fe7ae0 | Good   | 279.8     | 281.9     |
+--------------+--------+-----------+-----------+
| 9de1f9c8ca51 | Bad    | 257.243   | ---       |
+--------------+--------+-----------+-----------+
| 892f7237b3ff | Good   | 285       | 300.7     |
+--------------+--------+-----------+-----------+
| 0dd1cabe8a4a | Good   | 305.599   | 290.3     |
+--------------+--------+-----------+-----------+
| dfea84827f7e | Bad    | 250.2     | 258.899   |
+--------------+--------+-----------+-----------+
| 22a39c3d8693 | Bad    | 236.8     | 245.399   |
+--------------+--------+-----------+-----------+
| e2f3e35f1f5a | Good   | 277.599   | 287       |
+--------------+--------+-----------+-----------+
| 401e4963bf45 | Bad    | 250.149   | 248.899   |
+--------------+--------+-----------+-----------+
| 3e8c6c9aac42 | Good   | 299.09    | 294.9     |
+--------------+--------+-----------+-----------+
| 1fcf54deb767 | Good   | 292.719   | 301.299   |
+--------------+--------+-----------+-----------+
| c82a69629c53 | Bad    | 254.7     | 246.1     |
+--------------+--------+-----------+-----------+
| c02d5546ea34 | Good   | 276.4     | 294       |
+--------------+--------+-----------+-----------+
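
The numbers in the table can be summarized with a short script such as
the following. It is a sketch for readers, not part of the original
measurements: it only re-uses the values listed above, treats "---"
entries as missing, and compares the mean bandwidth of all Good runs
with the mean of all Bad runs.

```python
from statistics import mean

# (status, [run1, run2]) per commit, copied from the table; "---" omitted.
RUNS = {
    "526942b8134c": ("Bad", []),
    "2e7a95156d64": ("Bad", []),
    "26c350fe7ae0": ("Good", [279.8, 281.9]),
    "9de1f9c8ca51": ("Bad", [257.243]),
    "892f7237b3ff": ("Good", [285.0, 300.7]),
    "0dd1cabe8a4a": ("Good", [305.599, 290.3]),
    "dfea84827f7e": ("Bad", [250.2, 258.899]),
    "22a39c3d8693": ("Bad", [236.8, 245.399]),
    "e2f3e35f1f5a": ("Good", [277.599, 287.0]),
    "401e4963bf45": ("Bad", [250.149, 248.899]),
    "3e8c6c9aac42": ("Good", [299.09, 294.9]),
    "1fcf54deb767": ("Good", [292.719, 301.299]),
    "c82a69629c53": ("Bad", [254.7, 246.1]),
    "c02d5546ea34": ("Good", [276.4, 294.0]),
}

# Flatten the per-commit samples into one list per status.
good = [bw for status, runs in RUNS.values() if status == "Good" for bw in runs]
bad = [bw for status, runs in RUNS.values() if status == "Bad" for bw in runs]

# Relative drop of the mean Bad bandwidth versus the mean Good bandwidth.
mean_drop = 1.0 - mean(bad) / mean(good)

print(f"Good: n={len(good)} mean={mean(good):.1f} min={min(good):.1f} Gbps")
print(f"Bad:  n={len(bad)} mean={mean(bad):.1f} max={max(bad):.1f} Gbps")
print(f"Mean drop: {mean_drop:.1%}")
```

Every Good sample is above every Bad sample, so the two groups do not
overlap in this data set.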
