Message-ID: <20210420030837.GB31773@xsang-OptiPlex-9020>
Date:   Tue, 20 Apr 2021 11:08:37 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [signal]  4bad58ebc8:  will-it-scale.per_thread_ops -3.3% regression



Greetings,

FYI, we noticed a 3.3% regression (-3.3% change) of will-it-scale.per_thread_ops due to commit:


commit: 4bad58ebc8bc4f20d89cff95417c9b4674769709 ("signal: Allow tasks to cache one sigqueue struct")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: thread
	test: futex3
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



If you fix the issue, please add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/futex3/will-it-scale/0x5003006

commit: 
  69995ebbb9 ("signal: Hand SIGQUEUE_PREALLOC flag to __sigqueue_alloc()")
  4bad58ebc8 ("signal: Allow tasks to cache one sigqueue struct")

69995ebbb9d37173 4bad58ebc8bc4f20d89cff95417 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.273e+09            -3.3%  1.231e+09        will-it-scale.192.threads
   6630224            -3.3%    6409738        will-it-scale.per_thread_ops
 1.273e+09            -3.3%  1.231e+09        will-it-scale.workload
      1638 ±  3%      -7.8%       1510 ±  5%  sched_debug.cfs_rq:/.runnable_avg.max
    297.83 ± 68%   +1747.6%       5502 ±152%  interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
    297.83 ± 68%   +1747.6%       5502 ±152%  interrupts.CPU12.33:PCI-MSI.524291-edge.eth0-TxRx-2
      8200           -33.4%       5459 ± 35%  interrupts.CPU27.NMI:Non-maskable_interrupts
      8200           -33.4%       5459 ± 35%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
      8199           -33.4%       5459 ± 35%  interrupts.CPU28.NMI:Non-maskable_interrupts
      8199           -33.4%       5459 ± 35%  interrupts.CPU28.PMI:Performance_monitoring_interrupts
      6148 ± 33%     -11.2%       5459 ± 35%  interrupts.CPU29.NMI:Non-maskable_interrupts
      6148 ± 33%     -11.2%       5459 ± 35%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
      4287 ±  8%     +33.6%       5730 ± 15%  interrupts.CPU49.CAL:Function_call_interrupts
      6356 ± 19%     +49.6%       9509 ± 19%  interrupts.CPU97.CAL:Function_call_interrupts
 9.163e+10            -3.3%  8.857e+10        perf-stat.i.branch-instructions
 3.211e+08            -2.9%  3.118e+08        perf-stat.i.branch-misses
      0.94            +3.2%       0.97        perf-stat.i.cpi
    407730 ±  8%     +37.5%     560565 ±  7%  perf-stat.i.dTLB-load-misses
 1.551e+11            -3.3%  1.499e+11        perf-stat.i.dTLB-loads
    274320            -8.4%     251354 ± 18%  perf-stat.i.dTLB-store-misses
 1.169e+11            -3.3%   1.13e+11        perf-stat.i.dTLB-stores
 5.952e+11            -3.3%  5.754e+11        perf-stat.i.instructions
      1900            -4.9%       1807        perf-stat.i.instructions-per-iTLB-miss
      1.07            -3.2%       1.03        perf-stat.i.ipc
      1893            -3.3%       1830        perf-stat.i.metric.M/sec
      0.93            +3.3%       0.97        perf-stat.overall.cpi
      0.00 ±  8%      +0.0        0.00 ±  7%  perf-stat.overall.dTLB-load-miss-rate%
      1896            -5.1%       1800        perf-stat.overall.instructions-per-iTLB-miss
      1.07            -3.2%       1.04        perf-stat.overall.ipc
 9.131e+10            -3.3%  8.827e+10        perf-stat.ps.branch-instructions
   3.2e+08            -2.9%  3.107e+08        perf-stat.ps.branch-misses
    415959 ±  8%     +40.4%     583928 ±  7%  perf-stat.ps.dTLB-load-misses
 1.545e+11            -3.3%  1.494e+11        perf-stat.ps.dTLB-loads
    274020            -8.4%     250940 ± 18%  perf-stat.ps.dTLB-store-misses
 1.165e+11            -3.3%  1.126e+11        perf-stat.ps.dTLB-stores
 5.932e+11            -3.3%  5.734e+11        perf-stat.ps.instructions
 1.793e+14            -3.3%  1.733e+14        perf-stat.total.instructions
     32.73            -1.0       31.71        perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
      8.37            -0.2        8.20        perf-profile.calltrace.cycles-pp.hash_futex.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      1.52            -0.1        1.38        perf-profile.calltrace.cycles-pp.rcu_nocb_flush_deferred_wakeup.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      2.27            -0.1        2.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.17            -0.1        2.08        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.32            -0.1        1.26        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      5.45            +0.3        5.71        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      7.55            +0.4        7.98        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      5.07            +0.5        5.58        perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
     28.26            +0.9       29.19        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     37.41            +1.1       38.50        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     33.56            +1.2       34.78        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     52.14            +1.3       53.40        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     23.03            +1.4       24.44        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.10            -0.7       20.44        perf-profile.children.cycles-pp.__entry_text_start
     17.77            -0.5       17.31        perf-profile.children.cycles-pp.syscall_return_via_sysret
      8.48            -0.2        8.28        perf-profile.children.cycles-pp.hash_futex
      1.58            -0.1        1.44        perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
      2.43            -0.1        2.33        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.20            -0.1        2.11        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.42 ±  6%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.tick_sched_handle
      0.42 ±  6%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.update_process_times
      1.34            -0.1        1.29        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.52 ±  2%      -0.0        0.48 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.47 ±  2%      -0.0        0.43 ±  2%  perf-profile.children.cycles-pp.tick_sched_timer
      0.23 ±  4%      -0.0        0.20 ±  2%  perf-profile.children.cycles-pp.update_curr
      0.18 ±  4%      -0.0        0.16 ±  3%  perf-profile.children.cycles-pp.perf_prepare_sample
      5.60            +0.3        5.89        perf-profile.children.cycles-pp.get_futex_key
      8.20            +0.4        8.59        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      5.36            +0.5        5.86        perf-profile.children.cycles-pp.exit_to_user_mode_prepare
     23.57            +0.8       24.36        perf-profile.children.cycles-pp.futex_wake
     37.58            +1.1       38.68        perf-profile.children.cycles-pp.do_syscall_64
     52.56            +1.2       53.80        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     33.87            +1.2       35.11        perf-profile.children.cycles-pp.__x64_sys_futex
     28.60            +1.3       29.89        perf-profile.children.cycles-pp.do_futex
     17.64            -0.4       17.20        perf-profile.self.cycles-pp.syscall_return_via_sysret
      9.47            -0.3        9.15 ±  2%  perf-profile.self.cycles-pp.__entry_text_start
      6.88            -0.3        6.61        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      8.18            -0.2        7.98        perf-profile.self.cycles-pp.hash_futex
      1.33            -0.1        1.22        perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
      2.42            -0.1        2.32        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.85            -0.1        1.77        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.88            -0.1        1.81        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      1.25            -0.0        1.21        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.27            -0.0        1.23        perf-profile.self.cycles-pp.do_syscall_64
      1.69            +0.0        1.71        perf-profile.self.cycles-pp.testcase
      5.26            +0.2        5.48        perf-profile.self.cycles-pp.get_futex_key
      9.61            +0.4       10.02        perf-profile.self.cycles-pp.futex_wake
      3.74            +0.6        4.37        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      5.08            +0.8        5.90        perf-profile.self.cycles-pp.do_futex
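
The %change column above can be sanity-checked from the two per_thread_ops values. The following is a minimal sketch, not part of the LKP tooling: the helper name pct_change is hypothetical, and it assumes that will-it-scale.workload is per_thread_ops multiplied by the 192 threads, which matches the rounded figures in the table.

```python
# Sketch only: re-derive the headline numbers from the comparison table.
# pct_change is a hypothetical helper, not an LKP function.

def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0

THREADS = 192            # test machine thread count from the report
base_ops = 6630224       # per_thread_ops at 69995ebbb9 (parent commit)
new_ops = 6409738        # per_thread_ops at 4bad58ebc8 (reported commit)

change = pct_change(base_ops, new_ops)
print(f"per_thread_ops change: {change:+.1f}%")

# Assumption: workload = per_thread_ops * threads.
base_workload = base_ops * THREADS
new_workload = new_ops * THREADS
print(f"workload: {base_workload:.4g} -> {new_workload:.4g}")

# perf-profile rows report an absolute difference in percentage points,
# not a relative change, e.g. __entry_text_start.syscall: 32.73 -> 31.71.
print(f"calltrace delta: {31.71 - 32.73:+.1f}")
```

Note that the perf-stat and will-it-scale rows use a relative %change, while the perf-profile rows use a plain difference between the two cycle percentages.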


                                                                                
                             will-it-scale.per_thread_ops                       
                                                                                
  6.65e+06 +----------------------------------------------------------------+   
           |                                                     +    +.++.+|   
   6.6e+06 |-+                                             +     :          |   
           |           .++.  +.+   +      +                :    :           |   
  6.55e+06 |-+        +    ++   +.+ +.++. ::               ::   :           |   
   6.5e+06 |-.++.+    :                  + :              : :  +            |   
           |+     :+.:                      :       +   +.+ +.+             |   
  6.45e+06 |-+    +  +                      + +.+ .+ +.+                    |   
           |                             O   +O  +                          |   
   6.4e+06 |-+                 OO OOO OO  O O                               |   
  6.35e+06 |-+                                                              |   
           |                                                                |   
   6.3e+06 |-+    OO OO     O                                               |   
           |O OO O      OO O O                                              |   
  6.25e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00046-g4bad58ebc8bc" of type "text/plain" (172883 bytes)

View attachment "job-script" of type "text/plain" (8016 bytes)

View attachment "job.yaml" of type "text/plain" (5272 bytes)

View attachment "reproduce" of type "text/plain" (338 bytes)
