Message-ID: <CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com>
Date: Fri, 26 May 2023 17:00:22 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>, 
	Linus Torvalds <torvalds@...ux-foundation.org>
Cc: netdev <netdev@...r.kernel.org>
Subject: x86 copy performance regression

Hi Linus

While testing unrelated patches using upstream net-next kernels,
I found a big regression in sendmsg()/recvmsg() caused by a series of yours.

Tested platform:

Intel(R) Xeon(R) Gold 6268L CPU @ 2.80GHz

Kernel profiles show rep_movs_alternative() using more cycles than
the previous variant (copy_user_enhanced_fast_string, which simply
used "rep movsb"), and we can no longer reach line rate (as we could
before the series).


Patch series:

commit a5624566431de76b17862383d9ae254d9606cba9
Merge: 487c20b016dc48230367a7be017f40313e53e3bd 034ff37d34071ff3f48755f728cd229e42a4f15d
Author: Linus Torvalds <torvalds@...ux-foundation.org>
Date:   Mon Apr 24 10:39:27 2023 -0700

    Merge branch 'x86-rep-insns': x86 user copy clarifications

    Merge my x86 user copy updates branch.

IMO the patch quoted below seems to assume that tcp sendmsg() copies
small areas. (tcp_sendmsg() usually copies 32KB at a time, if order-3
page allocations are possible.)
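
The 32KB figure follows from the allocation order. A minimal sketch of the arithmetic, assuming 4 KiB base pages (the usual x86-64 page size); the helper name is hypothetical and not taken from the kernel:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as on x86-64


def alloc_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of a physically contiguous order-N page allocation."""
    return page_size << order


if __name__ == "__main__":
    # An order-3 allocation is 2**3 = 8 contiguous pages.
    print(alloc_bytes(3) // 1024, "KiB")
```

With these assumptions an order-3 allocation is 8 pages, which is the 32KB chunk size mentioned above, well beyond what would count as a short copy.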

commit adfcf4231b8cbc2d9c1e7bfaa965b907e60639eb
Author: Linus Torvalds <torvalds@...ux-foundation.org>
Date:   Sat Apr 15 13:14:59 2023 -0700

    x86: don't use REP_GOOD or ERMS for user memory copies

    The modern target to use is FSRM (Fast Short REP MOVS), and the other
    cases should only be used for bigger areas (ie mainly things like page
    clearing).

    Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>



The issue is that (some of) our platforms do have ERMS but not FSRM.
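
One way to check which case a given machine falls into is to look at the feature flags Linux reports in /proc/cpuinfo. The sketch below is illustrative only: it assumes the flags appear there as "erms" and "fsrm", and the function names are hypothetical.

```python
def parse_flags(cpuinfo_text):
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def rep_movs_support(flags):
    """Classify a CPU by the two REP MOVS features discussed in this thread."""
    if "fsrm" in flags:
        return "FSRM"
    if "erms" in flags:
        return "ERMS only"
    return "neither"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(rep_movs_support(parse_flags(f.read())))
    except OSError:
        print("/proc/cpuinfo not available on this system")
```

A machine that reports "ERMS only" is the kind of platform described above: it has ERMS but not FSRM.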

Test run on 6.3 (single TCP flow, sending 32 MB of payload to a
zerocopy receiver to make sure the receiver is not the bottleneck).
100Gbit link speed.

# perf stat taskset 02 tcp_mmap -H 2002:a05:6608:295::

 Performance counter stats for 'taskset 02 ./tcp_mmap -H 2002:a05:6608:295::':

          2,815.79 msec task-clock                       #    0.936 CPUs utilized
             2,370      context-switches                 #  841.682 /sec
                 1      cpu-migrations                   #    0.355 /sec
               127      page-faults                      #   45.103 /sec
    10,106,383,352      cycles                           #    3.589 GHz
     6,936,487,168      instructions                     #    0.69 insn per cycle
     1,206,325,691      branches                         #  428.414 M/sec
        10,327,112      branch-misses                    #    0.86% of all branches

       3.007810265 seconds time elapsed

       0.005158000 seconds user
       2.406125000 seconds sys


Same test on linux-6.4-rc1:

# perf stat taskset 02 tcp_mmap -H 2002:a05:6608:295::

 Performance counter stats for 'taskset 02 ./tcp_mmap -H 2002:a05:6608:295::':

          4,039.73 msec task-clock                       #    1.000 CPUs utilized
                12      context-switches                 #    2.970 /sec
                 1      cpu-migrations                   #    0.248 /sec
               130      page-faults                      #   32.180 /sec
    14,639,828,754      cycles                           #    3.624 GHz
    19,443,379,653      instructions                     #    1.33 insn per cycle
     1,931,003,961      branches                         #  478.003 M/sec
        12,349,476      branch-misses                    #    0.64% of all branches

       4.040825111 seconds time elapsed

       0.012496000 seconds user
       3.560336000 seconds sys
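
To put the two runs side by side, the counters can be compared directly. This is a small sketch rather than part of the original measurements: the numbers are copied from the two perf stat outputs above, and the dictionary and function names are made up for illustration.

```python
# Counter values copied from the two perf stat runs above.
RUN_6_3 = {
    "task_clock_msec": 2815.79,
    "cycles": 10_106_383_352,
    "instructions": 6_936_487_168,
    "branches": 1_931_003_961 - 724_678_270,  # = 1,206,325,691
    "elapsed_sec": 3.007810265,
}
RUN_6_4_RC1 = {
    "task_clock_msec": 4039.73,
    "cycles": 14_639_828_754,
    "instructions": 19_443_379_653,
    "branches": 1_931_003_961,
    "elapsed_sec": 4.040825111,
}


def ratios(before, after):
    """Return after/before for every counter present in both runs."""
    return {name: after[name] / before[name] for name in before if name in after}


def ipc(run):
    """Instructions per cycle, as perf stat reports it."""
    return run["instructions"] / run["cycles"]


if __name__ == "__main__":
    for name, value in ratios(RUN_6_3, RUN_6_4_RC1).items():
        print(f"{name:>16}: {value:.2f}x")
    print(f"IPC 6.3: {ipc(RUN_6_3):.2f}, IPC 6.4-rc1: {ipc(RUN_6_4_RC1):.2f}")
```

With these inputs, cycles grow by about 1.45x and retired instructions by about 2.8x for the same test, so the higher IPC on 6.4-rc1 reflects more instructions executed, not a faster copy.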

Thanks.
