Message-ID: <20210428071653.GC13086@xsang-OptiPlex-9020>
Date:   Wed, 28 Apr 2021 15:16:53 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [entry]  47b8ff194c:  will-it-scale.per_process_ops -3.0% regression



Greetings,

FYI, we noticed a -3.0% regression of will-it-scale.per_process_ops due to commit:


commit: 47b8ff194c1fd73d58dc339b597d466fe48c8958 ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: futex3
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/futex3/will-it-scale/0x5003006

commit: 
  f8bb5cae96 ("rcu/nocb: Trigger self-IPI on late deferred wake up before user resume")
  47b8ff194c ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")

f8bb5cae9616224a 47b8ff194c1fd73d58dc339b597 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  1.25e+09            -3.0%  1.212e+09        will-it-scale.192.processes
   6508984            -3.0%    6314032        will-it-scale.per_process_ops
  1.25e+09            -3.0%  1.212e+09        will-it-scale.workload
     68.00            -1.5%      67.00        vmstat.cpu.sy
     30.00            +3.3%      31.00        vmstat.cpu.us
 8.622e+10            +1.2%  8.728e+10        perf-stat.i.branch-instructions
      0.38            -0.0        0.36        perf-stat.i.branch-miss-rate%
 3.206e+08            -3.7%  3.088e+08        perf-stat.i.branch-misses
      0.98            +1.1%       0.99        perf-stat.i.cpi
    263518            +2.3%     269550        perf-stat.i.dTLB-store-misses
 1.135e+11            -1.9%  1.113e+11        perf-stat.i.dTLB-stores
 3.257e+08            -4.8%    3.1e+08        perf-stat.i.iTLB-load-misses
 5.718e+11            -1.1%  5.656e+11        perf-stat.i.instructions
      1758            +3.9%       1827        perf-stat.i.instructions-per-iTLB-miss
      1.03            -1.1%       1.01        perf-stat.i.ipc
      0.37            -0.0        0.35        perf-stat.overall.branch-miss-rate%
      0.97            +1.1%       0.99        perf-stat.overall.cpi
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1755            +3.9%       1824        perf-stat.overall.instructions-per-iTLB-miss
      1.03            -1.1%       1.02        perf-stat.overall.ipc
    138016            +2.0%     140712        perf-stat.overall.path-length
 8.592e+10            +1.2%  8.698e+10        perf-stat.ps.branch-instructions
 3.195e+08            -3.7%  3.078e+08        perf-stat.ps.branch-misses
    262973            +2.3%     269022        perf-stat.ps.dTLB-store-misses
 1.131e+11            -1.9%  1.109e+11        perf-stat.ps.dTLB-stores
 3.246e+08            -4.8%   3.09e+08        perf-stat.ps.iTLB-load-misses
 5.698e+11            -1.1%  5.637e+11        perf-stat.ps.instructions
 1.725e+14            -1.1%  1.706e+14        perf-stat.total.instructions
     32.11            -1.0       31.08        perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
     36.13            -0.3       35.81        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     39.88            -0.3       39.58        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     30.75            -0.2       30.57        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.22            -0.1        2.16        perf-profile.calltrace.cycles-pp.testcase
      2.15            -0.1        2.09        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.21            -0.0        2.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      6.22            +0.1        6.32        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      1.17            +0.1        1.31        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
     52.27            +1.4       53.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      3.53            +1.5        5.00        perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      0.00            +1.5        1.55        perf-profile.calltrace.cycles-pp.rcu_nocb_flush_deferred_wakeup.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      5.58            +1.9        7.47        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
     20.72            -0.6       20.09        perf-profile.children.cycles-pp.__entry_text_start
     17.34            -0.5       16.87        perf-profile.children.cycles-pp.syscall_return_via_sysret
     40.05            -0.3       39.75        perf-profile.children.cycles-pp.do_syscall_64
     36.43            -0.2       36.20        perf-profile.children.cycles-pp.__x64_sys_futex
     31.18            -0.2       30.98        perf-profile.children.cycles-pp.do_futex
      2.36            -0.1        2.30        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.41            -0.1        2.34        perf-profile.children.cycles-pp.testcase
      2.19            -0.1        2.12        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
     97.88            +0.1       97.94        perf-profile.children.cycles-pp.syscall
      6.46            +0.1        6.58        perf-profile.children.cycles-pp.get_futex_key
      1.19            +0.1        1.33        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
     52.69            +1.4       54.05        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00            +1.6        1.60        perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
      3.58            +1.7        5.33        perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      6.16            +1.9        8.07        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     15.88            -0.5       15.33        perf-profile.self.cycles-pp.syscall
     17.22            -0.5       16.75        perf-profile.self.cycles-pp.syscall_return_via_sysret
      6.00            -0.3        5.70        perf-profile.self.cycles-pp.do_futex
      9.33            -0.2        9.09        perf-profile.self.cycles-pp.__entry_text_start
      6.58            -0.2        6.35        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      8.29            -0.1        8.22        perf-profile.self.cycles-pp.hash_futex
      2.36            -0.1        2.29        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.86            -0.1        1.80        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      1.96            -0.0        1.91        perf-profile.self.cycles-pp.testcase
      1.14            +0.1        1.27        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      6.04            +0.2        6.19        perf-profile.self.cycles-pp.get_futex_key
      3.29            +0.4        3.65        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.00            +1.4        1.39        perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
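
The headline figures in the table above can be cross-checked with a little arithmetic. The sketch below is not part of the LKP tooling. It recomputes the %change column from the two per_process_ops values, and it checks two relationships that the reported numbers appear consistent with: that will-it-scale.workload is roughly per_process_ops times the 192 processes, and that perf-stat.overall.path-length is roughly total instructions divided by workload. Both relationships are assumptions inferred from the numbers, and the helper name pct_change is hypothetical. Note that the perf-profile rows report absolute differences in percentage points (for example 5.58 to 7.47 is +1.9), not relative changes.

```python
# Sanity-check sketch for the comparison table; all inputs are copied
# from the report, and the table rounds them to about 4 significant digits.

def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0

# will-it-scale.per_process_ops for the parent and the bisected commit.
base_ops, new_ops = 6508984, 6314032
ops_change = pct_change(base_ops, new_ops)
print(f"per_process_ops change: {ops_change:+.1f}%")

# Assumption: workload ~= per_process_ops * number of processes (192).
nr_processes = 192
base_workload_est = base_ops * nr_processes
new_workload_est = new_ops * nr_processes
print(f"estimated workload: {base_workload_est:.4g} -> {new_workload_est:.4g}")

# Assumption: path-length ~= perf-stat.total.instructions / workload.
base_path = 1.725e14 / 1.25e9
new_path = 1.706e14 / 1.212e9
print(f"estimated path-length: {base_path:.0f} -> {new_path:.0f} "
      f"({pct_change(base_path, new_path):+.1f}%)")

# perf-profile deltas are percentage points: syscall_exit_to_user_mode
# went from 5.58% to 7.47% of cycles in the calltrace rows.
exit_delta_pp = 7.47 - 5.58
print(f"syscall_exit_to_user_mode delta: {exit_delta_pp:+.1f} points")
```

Read this way, each operation executes about 2% more instructions on the bisected commit, which lines up with the extra cycles attributed to rcu_nocb_flush_deferred_wakeup on the syscall exit path.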


                                                                                
                              will-it-scale.192.processes                       
                                                                                
  1.4e+09 +-----------------------------------------------------------------+   
          |                                          .+.+  ++.++.+.+      +.|   
  1.2e+09 |.++.++.+O +.++.++.++.+.++.++.++.+.++.++.++   :  :        +.++.+  |   
          |       :  :                                  :  :                |   
    1e+09 |-+     :  :                                  :  :                |   
          |       : :                                   : :                 |   
    8e+08 |-+     : :                                   : :                 |   
          |        ::                                   : :                 |   
    6e+08 |-+      ::                                    ::                 |   
          |        :                                     ::                 |   
    4e+08 |-+      :                                     ::                 |   
          |        +                                     :                  |   
    2e+08 |-+                                            :                  |   
          |                                              :                  |   
        0 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  7e+06 +-------------------------------------------------------------------+   
        |.++.++.+ O+.+.++.++.+.++.+.++.++.+.++.++.+.++.+  +.++.+.++.+.++.++.|   
  6e+06 |-OO OO :  :                                   :  :                 |   
        |       :  :                                   :  :                 |   
  5e+06 |-+      : :                                   :  :                 |   
        |        : :                                    : :                 |   
  4e+06 |-+      : :                                    : :                 |   
        |        ::                                     : :                 |   
  3e+06 |-+      ::                                     ::                  |   
        |         :                                     ::                  |   
  2e+06 |-+       :                                     ::                  |   
        |         +                                      :                  |   
  1e+06 |-+                                              :                  |   
        |                                                :                  |   
      0 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  1.4e+09 +-----------------------------------------------------------------+   
          |                                          .+.+  ++.++.+.+      +.|   
  1.2e+09 |.++.++.+O +.++.++.++.+.++.++.++.+.++.++.++   :  :        +.++.+  |   
          |       :  :                                  :  :                |   
    1e+09 |-+     :  :                                  :  :                |   
          |       : :                                   : :                 |   
    8e+08 |-+     : :                                   : :                 |   
          |        ::                                   : :                 |   
    6e+08 |-+      ::                                    ::                 |   
          |        :                                     ::                 |   
    4e+08 |-+      :                                     ::                 |   
          |        +                                     :                  |   
    2e+08 |-+                                            :                  |   
          |                                              :                  |   
        0 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.11.0-00050-g47b8ff194c1f" of type "text/plain" (172429 bytes)

View attachment "job-script" of type "text/plain" (7692 bytes)

View attachment "job.yaml" of type "text/plain" (5240 bytes)

View attachment "reproduce" of type "text/plain" (339 bytes)
