Message-ID: <20210107134723.GA28532@xsang-OptiPlex-9020>
Date:   Thu, 7 Jan 2021 21:47:23 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Al Viro <viro@...iv.linux.org.uk>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [x86]  d55564cfc2:  will-it-scale.per_thread_ops -5.8% regression


Greetings,

FYI, we noticed a -5.8% regression in will-it-scale.per_thread_ops due to commit:


commit: d55564cfc222326e944893eff0c4118353e349ec ("x86: Make __put_user() generate an out-of-line call")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
with the following parameters:

	nr_task: 50%
	mode: thread
	test: poll2
	cpufreq_governor: performance
	ucode: 0x42e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
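
For readers unfamiliar with the workload, here is a minimal sketch of the kind of loop a poll-heavy testcase runs. It is not the actual poll2 testcase from will-it-scale; the function name, the use of pipes, and the descriptor count are assumptions chosen for illustration. Each zero-timeout poll() call makes the kernel write a 16-bit revents field back to user space for every descriptor, which is presumably why a 2-byte __put_user variant appears under do_sys_poll in the profiles further down.

```python
import os
import select


def count_poll_calls(n_fds: int = 8, iterations: int = 1000) -> int:
    """Poll n_fds idle pipe read ends with a zero timeout, iterations times.

    Hypothetical helper for illustration only; returns the number of
    poll() calls that completed.
    """
    pipes = [os.pipe() for _ in range(n_fds)]
    poller = select.poll()
    for read_end, _write_end in pipes:
        poller.register(read_end, select.POLLIN)

    calls = 0
    try:
        for _ in range(iterations):
            # Nothing was written to the pipes, so no descriptor is ready
            # and the call returns immediately with an empty list.
            ready = poller.poll(0)
            assert ready == []
            calls += 1
    finally:
        for read_end, write_end in pipes:
            os.close(read_end)
            os.close(write_end)
    return calls


if __name__ == "__main__":
    print(count_poll_calls())
```

The benchmark's per-thread or per-process figure is then simply how many such operations each parallel copy completes.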

In addition, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -6.2% regression             |
| test machine     | 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory     |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x42e                                                               |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -6.8% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory               |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -7.3% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x16                                                                |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -3.6% regression              |
| test machine     | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory    |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=thread                                                               |
|                  | nr_task=16                                                                |
|                  | test=poll2                                                                |
|                  | ucode=0x16                                                                |
+------------------+---------------------------------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e

commit: 
  ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
  d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")

ea6f043fc9847e67 d55564cfc222326e944893eff0c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   6600273            -5.8%    6218737        will-it-scale.24.threads
    275010            -5.8%     259113        will-it-scale.per_thread_ops
   6600273            -5.8%    6218737        will-it-scale.workload
     11069 ±105%    +196.1%      32775 ± 35%  numa-numastat.node1.other_node
      0.01 ±  8%     +21.4%       0.01 ±  6%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
      0.00 ± 23%     +50.0%       0.00 ± 11%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
     24562 ±  4%     +10.3%      27098 ±  2%  slabinfo.filp.active_objs
     25333 ±  4%     +10.0%      27863        slabinfo.filp.num_objs
     16632 ±  2%      -2.9%      16151        proc-vmstat.nr_active_anon
     19941            -2.4%      19466        proc-vmstat.nr_shmem
     16632 ±  2%      -2.9%      16151        proc-vmstat.nr_zone_active_anon
      7246 ± 87%    +333.9%      31446 ± 49%  softirqs.CPU25.SCHED
     19452 ±  6%     -28.5%      13915 ± 17%  softirqs.CPU40.RCU
      4067 ± 14%    +257.3%      14533 ± 99%  softirqs.CPU44.SCHED
     19591 ±  7%     -21.7%      15339 ± 25%  softirqs.CPU46.RCU
      0.00            +1.0        0.98 ±  3%  perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.07 ±  5%      +0.0        0.09 ±  9%  perf-profile.children.cycles-pp.vprintk_emit
      0.07 ±  5%      +0.0        0.09 ±  9%  perf-profile.children.cycles-pp.console_unlock
      0.07            +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.serial8250_console_write
      0.07 ±  6%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.uart_console_write
      0.53 ±  5%      +0.1        0.59 ±  2%  perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
      0.00            +1.8        1.77 ±  3%  perf-profile.children.cycles-pp.__put_user_nocheck_2
      0.00            +1.6        1.64 ±  3%  perf-profile.self.cycles-pp.__put_user_nocheck_2
     11.79 ±  8%      +2.4       14.22 ±  2%  perf-profile.self.cycles-pp.do_sys_poll
 2.349e+10            +4.2%  2.449e+10        perf-stat.i.branch-instructions
      0.21            -0.0        0.19        perf-stat.i.branch-miss-rate%
  45979592            -6.1%   43181339        perf-stat.i.branch-misses
  2.36e+10            -2.4%  2.304e+10        perf-stat.i.dTLB-loads
      0.10 ±  4%      -0.0        0.09        perf-stat.i.dTLB-store-miss-rate%
  14580547 ±  4%      -8.3%   13364460        perf-stat.i.dTLB-store-misses
   7364953            -5.1%    6985875        perf-stat.i.iTLB-load-misses
    346056 ±  3%      -8.7%     315837        perf-stat.i.iTLB-loads
 9.903e+10            -1.1%  9.791e+10        perf-stat.i.instructions
     13434            +4.3%      14007        perf-stat.i.instructions-per-iTLB-miss
      0.20            -0.0        0.18        perf-stat.overall.branch-miss-rate%
      0.10 ±  4%      -0.0        0.09        perf-stat.overall.dTLB-store-miss-rate%
     13447            +4.2%      14016        perf-stat.overall.instructions-per-iTLB-miss
   4517015            +5.0%    4744020        perf-stat.overall.path-length
 2.341e+10            +4.2%   2.44e+10        perf-stat.ps.branch-instructions
  45857713            -6.1%   43060109        perf-stat.ps.branch-misses
 2.352e+10            -2.4%  2.296e+10        perf-stat.ps.dTLB-loads
  14530174 ±  4%      -8.3%   13319056        perf-stat.ps.dTLB-store-misses
   7339560            -5.1%    6961988        perf-stat.ps.iTLB-load-misses
    344856 ±  3%      -8.7%     314759        perf-stat.ps.iTLB-loads
 9.869e+10            -1.1%  9.758e+10        perf-stat.ps.instructions
      1830 ± 19%     -36.4%       1163 ± 35%  interrupts.CPU0.CAL:Function_call_interrupts
    131.00 ±172%    +331.9%     565.75 ± 57%  interrupts.CPU1.TLB:TLB_shootdowns
      3444 ± 82%     +72.3%       5935 ± 41%  interrupts.CPU10.NMI:Non-maskable_interrupts
      3444 ± 82%     +72.3%       5935 ± 41%  interrupts.CPU10.PMI:Performance_monitoring_interrupts
      6463 ± 29%     -40.4%       3850 ± 14%  interrupts.CPU17.NMI:Non-maskable_interrupts
      6463 ± 29%     -40.4%       3850 ± 14%  interrupts.CPU17.PMI:Performance_monitoring_interrupts
      1268 ± 20%     +53.2%       1942 ± 22%  interrupts.CPU2.CAL:Function_call_interrupts
      1242 ± 51%     +90.4%       2365 ± 52%  interrupts.CPU22.CAL:Function_call_interrupts
     27.50 ± 37%    +206.4%      84.25 ± 73%  interrupts.CPU22.RES:Rescheduling_interrupts
      1439 ± 14%     -29.1%       1019 ± 26%  interrupts.CPU25.CAL:Function_call_interrupts
      6907 ± 32%     -53.8%       3194 ± 17%  interrupts.CPU25.NMI:Non-maskable_interrupts
      6907 ± 32%     -53.8%       3194 ± 17%  interrupts.CPU25.PMI:Performance_monitoring_interrupts
    170.50 ± 51%     -56.7%      73.75 ± 90%  interrupts.CPU25.RES:Rescheduling_interrupts
    596.50 ± 39%     -71.8%     168.00 ±171%  interrupts.CPU25.TLB:TLB_shootdowns
      3916 ± 30%     -45.6%       2130 ± 32%  interrupts.CPU3.NMI:Non-maskable_interrupts
      3916 ± 30%     -45.6%       2130 ± 32%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
      5969 ± 25%     -58.5%       2477 ± 46%  interrupts.CPU34.NMI:Non-maskable_interrupts
      5969 ± 25%     -58.5%       2477 ± 46%  interrupts.CPU34.PMI:Performance_monitoring_interrupts
      1345 ± 78%     -86.7%     179.50 ±172%  interrupts.CPU34.TLB:TLB_shootdowns
      6131 ± 31%     -49.0%       3129 ± 36%  interrupts.CPU4.NMI:Non-maskable_interrupts
      6131 ± 31%     -49.0%       3129 ± 36%  interrupts.CPU4.PMI:Performance_monitoring_interrupts
    722.50 ±  4%     -52.0%     346.50 ±100%  interrupts.CPU4.TLB:TLB_shootdowns
      1526 ±  5%     -27.1%       1112 ± 23%  interrupts.CPU40.CAL:Function_call_interrupts
      7314 ± 24%     -56.7%       3166 ± 35%  interrupts.CPU40.NMI:Non-maskable_interrupts
      7314 ± 24%     -56.7%       3166 ± 35%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
      5411 ± 31%     -28.8%       3853 ± 14%  interrupts.CPU46.NMI:Non-maskable_interrupts
      5411 ± 31%     -28.8%       3853 ± 14%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
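
As a sanity check on how to read the comparison rows above, the sketch below recomputes the %change column for a few rows from the two commit columns. It also cross-checks the path-length row under the assumption (not stated in this report) that path-length is roughly instructions per completed operation, so its change should follow the ratio of the instructions change to the workload change. The function name and the selection of rows are illustrative choices.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the parent commit value to the tested commit value, in percent."""
    return (new - base) / base * 100.0


# (parent commit ea6f043fc9, tested commit d55564cfc2), copied from the rows above.
rows = {
    "will-it-scale.per_thread_ops": (275010, 259113),
    "will-it-scale.workload": (6600273, 6218737),
    "perf-stat.overall.path-length": (4517015, 4744020),
}

for name, (base, new) in rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

# Assumption: path-length ~ instructions / operations. If so, the change in
# path-length should be close to (instructions ratio / workload ratio) - 1.
instr_ratio = 9.758e10 / 9.869e10      # perf-stat.ps.instructions
workload_ratio = 6218737 / 6600273     # will-it-scale.workload
path_length_change = (instr_ratio / workload_ratio - 1.0) * 100.0
print(f"estimated path-length change: {path_length_change:+.1f}%")
```

Under that assumption the estimate lands close to the reported +5.0%: slightly fewer instructions are retired per second, but noticeably fewer operations complete, so each operation costs more instructions.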


                                                                                
                              will-it-scale.24.threads                          
                                                                                
  7e+06 +-------------------------------------------------------------------+   
        |..+..+..+.+     +..+..+..+.+     O     O                           |   
  6e+06 |-+O  O  O :  O  :  O  O  O O                O  O  O  O  O  O O  O  |   
        |          :     :                                                  |   
  5e+06 |-+         :   :                                                   |   
        |           :   :                                                   |   
  4e+06 |-+         :   :                                                   |   
        |           :   :                                                   |   
  3e+06 |-+          : :                                                    |   
        |            : :                                                    |   
  2e+06 |-+          : :                                                    |   
        |            : :                                                    |   
  1e+06 |-+           :                                                     |   
        |             :                                                     |   
      0 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                            will-it-scale.per_thread_ops                        
                                                                                
  300000 +------------------------------------------------------------------+   
         |..+..+.+..+     +..+.+..+..+                                      |   
  250000 |-+O  O O  :  O  :  O O  O  O     O    O     O O  O  O  O  O O  O  |   
         |          :     :                                                 |   
         |           :   :                                                  |   
  200000 |-+         :   :                                                  |   
         |           :   :                                                  |   
  150000 |-+         :   :                                                  |   
         |            : :                                                   |   
  100000 |-+          : :                                                   |   
         |            : :                                                   |   
         |            : :                                                   |   
   50000 |-+           :                                                    |   
         |             :                                                    |   
       0 +------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               will-it-scale.workload                           
                                                                                
  7e+06 +-------------------------------------------------------------------+   
        |..+..+..+.+     +..+..+..+.+     O     O                           |   
  6e+06 |-+O  O  O :  O  :  O  O  O O                O  O  O  O  O  O O  O  |   
        |          :     :                                                  |   
  5e+06 |-+         :   :                                                   |   
        |           :   :                                                   |   
  4e+06 |-+         :   :                                                   |   
        |           :   :                                                   |   
  3e+06 |-+          : :                                                    |   
        |            : :                                                    |   
  2e+06 |-+          : :                                                    |   
        |            : :                                                    |   
  1e+06 |-+           :                                                     |   
        |             :                                                     |   
      0 +-------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-ivb-2ep1: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e

commit: 
  ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
  d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")

ea6f043fc9847e67 d55564cfc222326e944893eff0c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  14927808            -6.2%   14002190        will-it-scale.48.processes
    310995            -6.2%     291711        will-it-scale.per_process_ops
  14927808            -6.2%   14002190        will-it-scale.workload
    873.22 ±  2%      -4.2%     836.55        boot-time.idle
     28240 ±  2%      +3.7%      29282        proc-vmstat.nr_slab_unreclaimable
      6829 ±  3%     -12.7%       5965 ±  4%  numa-meminfo.node0.KernelStack
      5160 ±  5%     +17.4%       6057 ±  4%  numa-meminfo.node1.KernelStack
     29987 ± 12%     -16.0%      25186 ±  9%  softirqs.CPU46.RCU
     28923 ±  5%     -11.9%      25496 ±  6%  softirqs.CPU9.RCU
      6829 ±  3%     -12.6%       5965 ±  4%  numa-vmstat.node0.nr_kernel_stack
      5160 ±  5%     +17.4%       6058 ±  4%  numa-vmstat.node1.nr_kernel_stack
    476376 ± 20%     +30.7%     622825 ± 11%  numa-vmstat.node1.numa_local
      1135 ±  7%     +22.6%       1391 ±  3%  slabinfo.dmaengine-unmap-16.active_objs
      1135 ±  7%     +22.6%       1391 ±  3%  slabinfo.dmaengine-unmap-16.num_objs
    857.50 ±  5%     +15.0%     986.50 ±  2%  slabinfo.task_group.active_objs
    857.50 ±  5%     +15.0%     986.50 ±  2%  slabinfo.task_group.num_objs
     98.79 ± 10%     +16.9%     115.50 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
     63.89 ± 14%     +23.3%      78.81 ± 16%  sched_debug.cfs_rq:/.util_avg.stddev
    745060 ±  7%     -14.8%     634464 ±  8%  sched_debug.cpu.avg_idle.avg
   1273832 ± 17%     -18.5%    1038314 ±  6%  sched_debug.cpu.avg_idle.max
      2154 ± 10%    +188.1%       6207 ±101%  sched_debug.cpu.avg_idle.min
      0.09 ± 29%     +57.1%       0.14 ± 13%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      0.77 ± 16%     -50.5%       0.38 ± 25%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.stop_one_cpu
      6.77 ±  6%     +19.6%       8.09 ±  6%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
      7.24 ±  6%     +15.0%       8.33 ±  5%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
    118.91 ± 15%     -55.8%      52.50 ± 15%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      5138 ±  2%     +22.2%       6278 ± 13%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
      0.03 ± 57%    +228.3%       0.11 ± 42%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_irq_work
    717.87 ±173%    +106.8%       1484 ± 99%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
      0.48 ± 25%     -50.8%       0.23 ± 39%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
      0.06 ± 48%    +290.5%       0.22 ± 30%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_irq_work
    118.91 ± 15%     -55.8%      52.50 ± 15%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      1397 ±173%    +112.0%       2962 ± 99%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
      5138 ±  2%     +22.2%       6278 ± 13%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
     73378            +3.5%      75925        interrupts.CAL:Function_call_interrupts
      6339 ± 30%     -34.7%       4142        interrupts.CPU1.NMI:Non-maskable_interrupts
      6339 ± 30%     -34.7%       4142        interrupts.CPU1.PMI:Performance_monitoring_interrupts
      1109 ± 39%     -33.4%     739.00 ±  5%  interrupts.CPU1.RES:Rescheduling_interrupts
    596.75 ± 66%     -42.5%     343.00 ±  2%  interrupts.CPU10.RES:Rescheduling_interrupts
      4903 ± 26%     +55.2%       7610 ± 14%  interrupts.CPU12.NMI:Non-maskable_interrupts
      4903 ± 26%     +55.2%       7610 ± 14%  interrupts.CPU12.PMI:Performance_monitoring_interrupts
      1485 ± 46%     -36.3%     946.00 ± 12%  interrupts.CPU13.RES:Rescheduling_interrupts
    900.50 ± 16%     +99.1%       1792 ± 10%  interrupts.CPU2.RES:Rescheduling_interrupts
    396.50 ±  7%     -13.6%     342.75 ±  3%  interrupts.CPU33.RES:Rescheduling_interrupts
      7258 ± 24%     -28.8%       5171 ± 34%  interrupts.CPU34.NMI:Non-maskable_interrupts
      7258 ± 24%     -28.8%       5171 ± 34%  interrupts.CPU34.PMI:Performance_monitoring_interrupts
    860.25            +7.4%     923.75 ±  4%  interrupts.CPU44.CAL:Function_call_interrupts
    327.00 ±  3%     +22.7%     401.25 ± 13%  interrupts.CPU45.RES:Rescheduling_interrupts
      1708 ± 32%     -34.8%       1114 ± 20%  interrupts.CPU5.CAL:Function_call_interrupts
 3.377e+10            +9.8%  3.708e+10        perf-stat.i.branch-instructions
      0.29            -0.0        0.25        perf-stat.i.branch-miss-rate%
  94797779            -8.5%   86775592        perf-stat.i.branch-misses
 3.762e+10            -1.3%  3.714e+10        perf-stat.i.dTLB-loads
 2.076e+10            +2.5%  2.127e+10        perf-stat.i.dTLB-stores
  13777539           -13.2%   11957147 ±  3%  perf-stat.i.iTLB-load-misses
     12274           +15.7%      14203 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      1920            +3.6%       1990        perf-stat.i.metric.M/sec
      0.28            -0.0        0.23        perf-stat.overall.branch-miss-rate%
     12281           +15.6%      14199 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
   3412651            +6.8%    3645734        perf-stat.overall.path-length
 3.365e+10            +9.8%  3.695e+10        perf-stat.ps.branch-instructions
  94514447            -8.5%   86507575        perf-stat.ps.branch-misses
 3.749e+10            -1.3%  3.701e+10        perf-stat.ps.dTLB-loads
 2.069e+10            +2.5%  2.119e+10        perf-stat.ps.dTLB-stores
  13728170           -13.2%   11914029 ±  3%  perf-stat.ps.iTLB-load-misses
     33.31            -1.8       31.51        perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     73.13            -0.8       72.34        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      8.05            -0.4        7.66        perf-profile.calltrace.cycles-pp.testcase
      4.26            -0.3        3.92        perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.85            -0.3        5.60        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__poll
      2.68 ±  5%      -0.2        2.45 ±  3%  perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.96 ±  2%      -0.2        2.77 ±  2%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.51            -0.2        1.35        perf-profile.calltrace.cycles-pp.__kmalloc.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.80 ±  4%      +0.0        0.84        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      3.99            +0.1        4.13        perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
     83.08            +0.2       83.24        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     90.96            +0.4       91.34        perf-profile.calltrace.cycles-pp.__poll
     76.51            +0.6       77.14        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     75.59            +0.8       76.37        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      0.00            +7.0        6.97        perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     31.93            -1.9       30.00        perf-profile.children.cycles-pp.__fget_light
      8.13            -0.4        7.73        perf-profile.children.cycles-pp.testcase
      4.21            -0.3        3.89        perf-profile.children.cycles-pp.__fdget
      2.80 ±  4%      -0.3        2.54 ±  3%  perf-profile.children.cycles-pp.__check_object_size
      5.89            -0.3        5.64        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      3.00 ±  2%      -0.2        2.80 ±  2%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      1.60            -0.2        1.43        perf-profile.children.cycles-pp.__kmalloc
      0.16 ±  2%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.36 ±  5%      -0.1        0.29 ± 13%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.52            -0.1        0.46 ±  3%  perf-profile.children.cycles-pp.__check_heap_object
      0.18 ±  4%      -0.0        0.13 ±  3%  perf-profile.children.cycles-pp._cond_resched
      0.10 ±  5%      -0.0        0.08        perf-profile.children.cycles-pp.__x86_retpoline_rax
      0.19 ±  3%      +0.0        0.21 ±  2%  perf-profile.children.cycles-pp.poll_freewait
      0.86 ±  3%      +0.0        0.90        perf-profile.children.cycles-pp.__might_fault
      4.00            +0.1        4.15        perf-profile.children.cycles-pp.__entry_text_start
     83.21            +0.2       83.40        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     91.58            +0.3       91.90        perf-profile.children.cycles-pp.__poll
     76.77            +0.5       77.31        perf-profile.children.cycles-pp.do_syscall_64
     75.63            +0.8       76.41        perf-profile.children.cycles-pp.__x64_sys_poll
     74.72            +0.9       75.61        perf-profile.children.cycles-pp.do_sys_poll
      0.00            +5.8        5.77        perf-profile.children.cycles-pp.__put_user_nocheck_2
     29.62            -1.8       27.84        perf-profile.self.cycles-pp.__fget_light
      7.89            -0.4        7.51        perf-profile.self.cycles-pp.testcase
      5.78            -0.2        5.53        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      2.96 ±  2%      -0.2        2.75 ±  2%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      2.15            -0.2        1.96 ±  2%  perf-profile.self.cycles-pp.__fdget
      0.61 ±  2%      -0.2        0.44        perf-profile.self.cycles-pp.do_syscall_64
      1.09 ±  7%      -0.1        0.98 ±  4%  perf-profile.self.cycles-pp.__check_object_size
      0.73 ±  2%      -0.1        0.63 ±  3%  perf-profile.self.cycles-pp.__x64_sys_poll
      0.57 ±  3%      -0.1        0.47 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.85 ±  2%      -0.1        0.77 ±  2%  perf-profile.self.cycles-pp.__kmalloc
      0.30 ±  4%      -0.1        0.24 ± 12%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.49            -0.0        0.45 ±  2%  perf-profile.self.cycles-pp.__check_heap_object
      0.09            -0.0        0.06        perf-profile.self.cycles-pp._cond_resched
      0.15 ±  3%      -0.0        0.14 ±  5%  perf-profile.self.cycles-pp.poll_select_set_timeout
      3.55            +0.2        3.74        perf-profile.self.cycles-pp.__entry_text_start
      0.00            +3.6        3.58        perf-profile.self.cycles-pp.__put_user_nocheck_2
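
Note that the perf-profile rows above are read differently from the throughput rows: the middle column has no percent sign and appears to be the absolute difference in percentage points between the two commits, not a relative change. A small sketch that checks this reading against three calltrace rows from this table (the function name is an illustrative choice):

```python
def point_delta(base_pct: float, new_pct: float) -> float:
    """Difference in percentage points between two cycle shares."""
    return new_pct - base_pct


# (parent commit, tested commit) cycle shares from the calltrace rows above.
calltrace_rows = {
    "__fget_light": (33.31, 31.51),
    "do_sys_poll": (73.13, 72.34),
    "__put_user_nocheck_2": (0.00, 6.97),
}

for symbol, (base, new) in calltrace_rows.items():
    print(f"{symbol}: {point_delta(base, new):+.1f} points")
```

Read this way, the new out-of-line __put_user_nocheck_2 call accounts for roughly 7 points of cycles in this configuration that did not exist as a separate symbol before the commit, while the shares of the other symbols shrink slightly to make room.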



***************************************************************************************************
lkp-csl-2ap1: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
  4k/gcc-9/performance/1SSD/btrfs/sync/x86_64-rhel-8.3/8/debian-10.4-x86_64-20200603.cgz/300s/randwrite/lkp-csl-2ap1/256g/fio-basic/0x4003003

commit: 
  ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
  d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")

ea6f043fc9847e67 d55564cfc222326e944893eff0c 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:2          -50%            :2     kmsg.ACPI_Error
          0:2           -1%           0:2     perf-profile.children.cycles-pp.error_entry



***************************************************************************************************
lkp-csl-2ap3: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap3/poll2/will-it-scale/0x5002f01

commit: 
  ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
  d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")

ea6f043fc9847e67 d55564cfc222326e944893eff0c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  49799766            -6.8%   46397591        will-it-scale.192.processes
    259373            -6.8%     241653        will-it-scale.per_process_ops
  49799766            -6.8%   46397591        will-it-scale.workload
      5355 ±  3%      -2.8%       5203        boot-time.idle
    219459 ±  5%     -10.0%     197460 ±  2%  numa-numastat.node2.local_node
     20202 ± 33%     +53.8%      31071        numa-numastat.node2.other_node
      5399 ± 13%     +25.8%       6794        slabinfo.khugepaged_mm_slot.active_objs
      5399 ± 13%     +25.8%       6794        slabinfo.khugepaged_mm_slot.num_objs
     27584 ±  3%      +4.4%      28788        proc-vmstat.nr_active_anon
     31838 ±  3%      +3.9%      33095        proc-vmstat.nr_shmem
     27584 ±  3%      +4.4%      28788        proc-vmstat.nr_zone_active_anon
      4438 ± 96%     -97.2%     123.12 ± 23%  sched_debug.cfs_rq:/.load_avg.max
    322.01 ± 95%     -96.2%      12.19 ± 24%  sched_debug.cfs_rq:/.load_avg.stddev
    161.08 ±  3%     -11.3%     142.88 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      2008 ± 52%     -89.0%     221.50 ± 70%  numa-meminfo.node2.Active
      2008 ± 52%     -89.0%     221.50 ± 70%  numa-meminfo.node2.Active(anon)
      9747 ± 10%     -21.8%       7622 ± 11%  numa-meminfo.node2.PageTables
     79271 ± 36%     +77.4%     140623 ± 25%  numa-meminfo.node3.AnonPages
     87506 ± 36%     +68.2%     147211 ± 24%  numa-meminfo.node3.Inactive
     87506 ± 36%     +68.2%     147211 ± 24%  numa-meminfo.node3.Inactive(anon)
    278145            +6.8%     297050 ±  6%  numa-meminfo.node3.Unevictable
    501.75 ± 52%     -89.0%      55.00 ± 71%  numa-vmstat.node2.nr_active_anon
      2434 ± 10%     -21.7%       1905 ± 11%  numa-vmstat.node2.nr_page_table_pages
    501.75 ± 52%     -89.0%      55.00 ± 71%  numa-vmstat.node2.nr_zone_active_anon
    638194 ± 13%     -22.7%     493421 ±  8%  numa-vmstat.node2.numa_hit
    525818 ± 16%     -29.6%     369990 ± 10%  numa-vmstat.node2.numa_local
    112375 ±  5%      +9.8%     123431        numa-vmstat.node2.numa_other
     19778 ± 36%     +78.0%      35206 ± 25%  numa-vmstat.node3.nr_anon_pages
     21798 ± 36%     +69.4%      36921 ± 24%  numa-vmstat.node3.nr_inactive_anon
     69536            +6.8%      74262 ±  6%  numa-vmstat.node3.nr_unevictable
     21798 ± 36%     +69.4%      36921 ± 24%  numa-vmstat.node3.nr_zone_inactive_anon
     69536            +6.8%      74262 ±  6%  numa-vmstat.node3.nr_zone_unevictable
    307.75           +31.2%     403.75 ± 31%  interrupts.CPU105.RES:Rescheduling_interrupts
    305.75           +46.0%     446.25 ± 45%  interrupts.CPU114.RES:Rescheduling_interrupts
    318.00 ±  4%     +82.6%     580.75 ± 69%  interrupts.CPU12.RES:Rescheduling_interrupts
      2428 ± 15%     +41.3%       3433 ± 18%  interrupts.CPU122.CAL:Function_call_interrupts
    434.75 ± 34%     -29.3%     307.25        interrupts.CPU136.RES:Rescheduling_interrupts
    363.00 ±  5%     +32.0%     479.25 ± 33%  interrupts.CPU191.RES:Rescheduling_interrupts
      6365 ± 33%     -17.3%       5263 ± 34%  interrupts.CPU23.NMI:Non-maskable_interrupts
      6365 ± 33%     -17.3%       5263 ± 34%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
    324.25 ±  3%     +18.7%     384.75 ± 18%  interrupts.CPU3.RES:Rescheduling_interrupts
    427.25 ± 26%     -26.9%     312.50 ±  3%  interrupts.CPU39.RES:Rescheduling_interrupts
      6491 ± 33%     -17.6%       5347 ± 34%  interrupts.CPU78.NMI:Non-maskable_interrupts
      6491 ± 33%     -17.6%       5347 ± 34%  interrupts.CPU78.PMI:Performance_monitoring_interrupts
    326.25 ±  4%      -4.8%     310.75 ±  4%  interrupts.CPU83.RES:Rescheduling_interrupts
    362.50 ± 13%     -13.4%     314.00 ±  4%  interrupts.CPU88.RES:Rescheduling_interrupts
      8654           -38.2%       5350 ± 34%  interrupts.CPU93.NMI:Non-maskable_interrupts
      8654           -38.2%       5350 ± 34%  interrupts.CPU93.PMI:Performance_monitoring_interrupts
    411.00 ± 16%     -19.8%     329.50 ±  4%  interrupts.CPU95.RES:Rescheduling_interrupts
    165.25 ±  6%     +32.8%     219.50 ±  3%  interrupts.IWI:IRQ_work_interrupts
      0.08 ±  9%     -46.2%       0.04 ± 13%  perf-stat.i.MPKI
 1.124e+11            +9.0%  1.226e+11        perf-stat.i.branch-instructions
      0.28            -0.1        0.23        perf-stat.i.branch-miss-rate%
  3.02e+08           -13.0%  2.626e+08        perf-stat.i.branch-misses
     11.73            -1.9        9.87        perf-stat.i.cache-miss-rate%
   4146579 ±  2%     -65.7%    1420631 ±  3%  perf-stat.i.cache-misses
  35384957 ±  2%     -60.1%   14124161 ±  2%  perf-stat.i.cache-references
    141836 ±  2%    +219.1%     452664 ±  3%  perf-stat.i.cycles-between-cache-misses
    628080 ±  2%     -29.6%     441936 ±  5%  perf-stat.i.dTLB-load-misses
 1.284e+11            -2.2%  1.255e+11        perf-stat.i.dTLB-loads
 5.923e+10            +3.1%  6.108e+10        perf-stat.i.dTLB-stores
  22557203           -12.5%   19727021        perf-stat.i.iTLB-load-misses
     25065           +12.9%      28294        perf-stat.i.instructions-per-iTLB-miss
      1563            +3.0%       1610        perf-stat.i.metric.M/sec
   1187563 ±  3%     -77.5%     266628        perf-stat.i.node-load-misses
    136499 ±  7%     -70.9%      39734 ±  2%  perf-stat.i.node-loads
     98.41            -3.3       95.10        perf-stat.i.node-store-miss-rate%
    387351 ±  3%     -73.3%     103454        perf-stat.i.node-store-misses
      9110 ±  7%     +10.6%      10079 ±  7%  perf-stat.i.node-stores
      0.06 ±  2%     -59.7%       0.03 ±  2%  perf-stat.overall.MPKI
      0.27            -0.1        0.21        perf-stat.overall.branch-miss-rate%
     11.71            -1.7       10.03        perf-stat.overall.cache-miss-rate%
    138134 ±  2%    +189.6%     400066 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.00 ±  2%      -0.0        0.00        perf-stat.overall.dTLB-load-miss-rate%
     24933           +13.7%      28356        perf-stat.overall.instructions-per-iTLB-miss
     89.60            -3.3       86.26        perf-stat.overall.node-load-miss-rate%
     97.67            -6.6       91.03        perf-stat.overall.node-store-miss-rate%
   3404327            +6.7%    3632027        perf-stat.overall.path-length
 1.121e+11            +9.0%  1.222e+11        perf-stat.ps.branch-instructions
  3.01e+08           -13.0%  2.618e+08        perf-stat.ps.branch-misses
   4136846 ±  2%     -65.6%    1421212 ±  3%  perf-stat.ps.cache-misses
  35332816 ±  2%     -59.9%   14168003 ±  2%  perf-stat.ps.cache-references
    632868 ±  2%     -28.1%     454876        perf-stat.ps.dTLB-load-misses
  1.28e+11            -2.2%  1.251e+11        perf-stat.ps.dTLB-loads
 5.902e+10            +3.1%  6.087e+10        perf-stat.ps.dTLB-stores
  22483939           -12.6%   19660586        perf-stat.ps.iTLB-load-misses
   1183807 ±  3%     -77.5%     265769        perf-stat.ps.node-load-misses
    137518 ±  7%     -69.2%      42343 ±  2%  perf-stat.ps.node-loads
    386094 ±  3%     -73.3%     103120        perf-stat.ps.node-store-misses
      9169 ±  7%     +10.9%      10169 ±  6%  perf-stat.ps.node-stores
     95.69            -0.2       95.47        perf-profile.calltrace.cycles-pp.__poll
      2.68 ±  2%      -0.1        2.58        perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.76            -0.0        0.75        perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
      0.53            +0.1        0.59        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.17            +0.1        1.23        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.62            +0.1        2.70        perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.74            +0.1        0.85        perf-profile.calltrace.cycles-pp.__kmalloc.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.80 ±  2%      +0.2        4.04 ±  3%  perf-profile.calltrace.cycles-pp.testcase
     20.09 ±  2%      +2.8       22.93        perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     91.94            +5.2       97.17        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      0.00           +38.3       38.28        perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     96.15            -0.2       95.91        perf-profile.children.cycles-pp.__poll
      2.68            -0.1        2.56        perf-profile.children.cycles-pp.__fdget
      0.22            -0.0        0.18 ±  2%  perf-profile.children.cycles-pp.__might_sleep
      0.15 ±  3%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.poll_freewait
      0.09            -0.0        0.07        perf-profile.children.cycles-pp.poll_select_set_timeout
      0.18            -0.0        0.17        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.29 ±  4%      +0.0        0.32 ±  3%  perf-profile.children.cycles-pp.__check_heap_object
      0.10            +0.0        0.13 ±  3%  perf-profile.children.cycles-pp.check_stack_object
      0.57            +0.1        0.62        perf-profile.children.cycles-pp.__might_fault
      0.32 ±  3%      +0.1        0.39        perf-profile.children.cycles-pp.___might_sleep
      1.20            +0.1        1.28        perf-profile.children.cycles-pp.__check_object_size
      2.65            +0.1        2.74        perf-profile.children.cycles-pp._copy_from_user
      0.79            +0.1        0.90        perf-profile.children.cycles-pp.__kmalloc
      3.85 ±  2%      +0.2        4.09 ±  3%  perf-profile.children.cycles-pp.testcase
     18.83 ±  2%      +2.9       21.70        perf-profile.children.cycles-pp.__fget_light
      0.00           +43.6       43.61        perf-profile.children.cycles-pp.__put_user_nocheck_2
     68.36           -45.9       22.50        perf-profile.self.cycles-pp.do_sys_poll
      1.34 ±  2%      -0.1        1.26        perf-profile.self.cycles-pp.__fdget
      0.20            -0.0        0.17 ±  3%  perf-profile.self.cycles-pp.__might_sleep
      0.21 ±  2%      -0.0        0.18 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.09 ±  4%      -0.0        0.07        perf-profile.self.cycles-pp.poll_select_set_timeout
      0.11 ±  3%      -0.0        0.10        perf-profile.self.cycles-pp.poll_freewait
      0.09            +0.0        0.11 ±  3%  perf-profile.self.cycles-pp.check_stack_object
      0.30            +0.0        0.33        perf-profile.self.cycles-pp.__check_object_size
      0.18 ±  2%      +0.0        0.21 ±  2%  perf-profile.self.cycles-pp._copy_from_user
      0.18 ±  3%      +0.0        0.22        perf-profile.self.cycles-pp.__might_fault
      0.32 ±  3%      +0.1        0.38        perf-profile.self.cycles-pp.___might_sleep
      0.43            +0.1        0.52        perf-profile.self.cycles-pp.__kmalloc
      3.79 ±  2%      +0.2        4.02 ±  3%  perf-profile.self.cycles-pp.testcase
     17.33 ±  2%      +2.9       20.27        perf-profile.self.cycles-pp.__fget_light
      0.00           +42.6       42.61        perf-profile.self.cycles-pp.__put_user_nocheck_2
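
For readers checking the comparison table above, here is a minimal sketch of how its columns appear to relate. It assumes (the report does not state this) that rows whose change column ends in % give the relative change against the base commit ea6f043fc9, while perf-profile rows give an absolute difference in percentage points. The helper names are hypothetical and are not part of the LKP tooling.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference in percentage points (assumed for perf-profile rows)."""
    return new - base


# will-it-scale.192.processes on lkp-csl-2ap3 (values copied from the table)
throughput = pct_change(49799766, 46397591)

# perf-profile.self.cycles-pp.do_sys_poll before and after the commit
poll_self = point_delta(68.36, 22.50)

# perf-profile.self.cycles-pp.__put_user_nocheck_2 before and after the commit
put_user_self = point_delta(0.00, 42.61)

print(f"{throughput:+.1f}%  {poll_self:+.1f}  {put_user_self:+.1f}")
```

Under these assumptions the sketch reproduces the -6.8%, -45.9 and +42.6 entries. The similar magnitudes of the last two suggest that cycles previously attributed to do_sys_poll itself are now attributed to __put_user_nocheck_2, which fits the commit title but is an interpretation rather than something the report states.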



***************************************************************************************************
lkp-hsw-4ex1: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/poll2/will-it-scale/0x16

commit: 
  ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
  d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")

ea6f043fc9847e67 d55564cfc222326e944893eff0c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  42577786            -7.3%   39477406        will-it-scale.144.processes
    295678            -7.3%     274148        will-it-scale.per_process_ops
  42577786            -7.3%   39477406        will-it-scale.workload
     57721            -1.2%      57029        proc-vmstat.nr_slab_unreclaimable
     19.00            -5.3%      18.00        vmstat.cpu.us
     90088 ± 79%     -70.3%      26733 ± 15%  numa-meminfo.node1.AnonPages
     94290 ± 72%     -63.4%      34523 ± 23%  numa-meminfo.node1.Inactive
     94178 ± 72%     -63.3%      34518 ± 23%  numa-meminfo.node1.Inactive(anon)
      3764 ± 11%     -19.7%       3023 ±  2%  numa-meminfo.node1.PageTables
     20104 ± 13%     -15.5%      16993 ±  9%  softirqs.CPU0.RCU
     18905 ±  6%      -8.6%      17277 ±  5%  softirqs.CPU136.RCU
     16811 ±  4%     -12.0%      14790 ±  4%  softirqs.CPU71.RCU
     19562 ±  3%      -9.7%      17666 ±  7%  softirqs.CPU97.RCU
     22522 ± 79%     -70.2%       6705 ± 15%  numa-vmstat.node1.nr_anon_pages
     23544 ± 72%     -63.3%       8649 ± 23%  numa-vmstat.node1.nr_inactive_anon
    941.00 ± 11%     -19.7%     756.00 ±  2%  numa-vmstat.node1.nr_page_table_pages
     23544 ± 72%     -63.3%       8649 ± 23%  numa-vmstat.node1.nr_zone_inactive_anon
    419078 ± 10%     -17.1%     347285 ±  3%  numa-vmstat.node1.numa_local
      0.05 ±  4%     +12.6%       0.06 ±  5%  sched_debug.cfs_rq:/.nr_running.stddev
     39.42 ±100%    +122.5%      87.71 ±  2%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     39.38 ±100%    +122.8%      87.71 ±  2%  sched_debug.cfs_rq:/.removed.util_avg.max
     92.35 ±  6%     +10.9%     102.44 ±  5%  sched_debug.cfs_rq:/.runnable_avg.stddev
    732.50 ±  7%     +22.2%     895.42 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.max
     89.39 ±  8%     +50.5%     134.49 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      2369 ±  3%     -13.0%       2062 ±  5%  slabinfo.PING.active_objs
      2369 ±  3%     -13.0%       2062 ±  5%  slabinfo.PING.num_objs
      1124 ±  7%     -11.3%     997.75 ±  5%  slabinfo.file_lock_cache.active_objs
      1124 ±  7%     -11.3%     997.75 ±  5%  slabinfo.file_lock_cache.num_objs
      2775 ±  5%     -20.4%       2208 ±  7%  slabinfo.fsnotify_mark_connector.active_objs
      2775 ±  5%     -20.4%       2208 ±  7%  slabinfo.fsnotify_mark_connector.num_objs
     11030 ±  6%      -8.7%      10069 ±  5%  slabinfo.pde_opener.active_objs
     11030 ±  6%      -8.7%      10069 ±  5%  slabinfo.pde_opener.num_objs
    425.00 ±100%    +116.8%     921.25 ±  3%  syscalls.sys_close.med
      4507           +13.2%       5102        syscalls.sys_poll.min
  17548777 ±  2%  +2.3e+06    19877523 ±  4%  syscalls.sys_poll.noise.100%
  22979833 ±  3%  +4.7e+06    27645541 ±  4%  syscalls.sys_poll.noise.2%
  17799035 ±  2%  +2.4e+06    20156928 ±  4%  syscalls.sys_poll.noise.25%
  20161873 ±  3%  +3.1e+06    23286940 ±  4%  syscalls.sys_poll.noise.5%
  17648058 ±  2%  +2.4e+06    20015410 ±  4%  syscalls.sys_poll.noise.50%
  17585605 ±  2%  +2.4e+06    19958729 ±  4%  syscalls.sys_poll.noise.75%
      0.11 ± 19%     +35.8%       0.15 ± 16%  perf-stat.i.MPKI
 9.917e+10            +8.0%  1.071e+11        perf-stat.i.branch-instructions
      0.29            -0.0        0.25        perf-stat.i.branch-miss-rate%
 2.791e+08            -8.5%  2.554e+08        perf-stat.i.branch-misses
 4.037e+11            -1.7%  3.968e+11        perf-stat.i.cpu-cycles
   1336350            -8.7%    1220687 ± 13%  perf-stat.i.cycles-between-cache-misses
 1.114e+11            -2.9%  1.082e+11        perf-stat.i.dTLB-loads
      0.09 ± 15%      -0.0        0.07 ±  2%  perf-stat.i.dTLB-store-miss-rate%
  50453634 ± 15%     -19.9%   40407884 ±  2%  perf-stat.i.dTLB-store-misses
 5.934e+10            +1.2%  6.004e+10        perf-stat.i.dTLB-stores
  45611355            +6.4%   48521131 ±  2%  perf-stat.i.iTLB-load-misses
 4.969e+11            -1.2%  4.908e+11        perf-stat.i.instructions
     10882            -7.0%      10125 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      2.80            -1.7%       2.75        perf-stat.i.metric.GHz
      1873            +2.0%       1911        perf-stat.i.metric.M/sec
      0.28            -0.0        0.24        perf-stat.overall.branch-miss-rate%
      0.08 ± 15%      -0.0        0.07 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
     10891            -7.0%      10124        perf-stat.overall.instructions-per-iTLB-miss
   3511981            +6.8%    3750127        perf-stat.overall.path-length
 9.878e+10            +8.1%  1.067e+11        perf-stat.ps.branch-instructions
 2.781e+08            -8.5%  2.545e+08        perf-stat.ps.branch-misses
 4.021e+11            -1.7%  3.953e+11        perf-stat.ps.cpu-cycles
 1.109e+11            -2.8%  1.078e+11        perf-stat.ps.dTLB-loads
  50236046 ± 15%     -19.9%   40254632 ±  2%  perf-stat.ps.dTLB-store-misses
 5.911e+10            +1.2%  5.982e+10        perf-stat.ps.dTLB-stores
  45438239            +6.3%   48313052 ±  2%  perf-stat.ps.iTLB-load-misses
 4.949e+11            -1.2%   4.89e+11        perf-stat.ps.instructions
 1.495e+14            -1.0%   1.48e+14        perf-stat.total.instructions
      6804 ± 24%     -30.0%       4763 ± 35%  interrupts.CPU103.NMI:Non-maskable_interrupts
      6804 ± 24%     -30.0%       4763 ± 35%  interrupts.CPU103.PMI:Performance_monitoring_interrupts
    349.75 ± 14%     -13.9%     301.25        interrupts.CPU104.RES:Rescheduling_interrupts
      6859 ± 24%     -34.0%       4528 ± 24%  interrupts.CPU108.NMI:Non-maskable_interrupts
      6859 ± 24%     -34.0%       4528 ± 24%  interrupts.CPU108.PMI:Performance_monitoring_interrupts
      7894           -38.3%       4868 ± 35%  interrupts.CPU114.NMI:Non-maskable_interrupts
      7894           -38.3%       4868 ± 35%  interrupts.CPU114.PMI:Performance_monitoring_interrupts
      5905 ± 33%     -17.8%       4855 ± 34%  interrupts.CPU119.NMI:Non-maskable_interrupts
      5905 ± 33%     -17.8%       4855 ± 34%  interrupts.CPU119.PMI:Performance_monitoring_interrupts
      6873 ± 24%     -29.6%       4841 ± 34%  interrupts.CPU121.NMI:Non-maskable_interrupts
      6873 ± 24%     -29.6%       4841 ± 34%  interrupts.CPU121.PMI:Performance_monitoring_interrupts
      5938 ± 33%     -34.0%       3920        interrupts.CPU129.NMI:Non-maskable_interrupts
      5938 ± 33%     -34.0%       3920        interrupts.CPU129.PMI:Performance_monitoring_interrupts
      6981 ± 24%     -44.0%       3909        interrupts.CPU131.NMI:Non-maskable_interrupts
      6981 ± 24%     -44.0%       3909        interrupts.CPU131.PMI:Performance_monitoring_interrupts
      7944           -41.9%       4612 ± 26%  interrupts.CPU135.NMI:Non-maskable_interrupts
      7944           -41.9%       4612 ± 26%  interrupts.CPU135.PMI:Performance_monitoring_interrupts
      5952 ± 33%     -17.8%       4894 ± 34%  interrupts.CPU136.NMI:Non-maskable_interrupts
      5952 ± 33%     -17.8%       4894 ± 34%  interrupts.CPU136.PMI:Performance_monitoring_interrupts
      5939 ± 33%     -33.9%       3923        interrupts.CPU137.NMI:Non-maskable_interrupts
      5939 ± 33%     -33.9%       3923        interrupts.CPU137.PMI:Performance_monitoring_interrupts
      6978 ± 24%     -29.6%       4913 ± 34%  interrupts.CPU138.NMI:Non-maskable_interrupts
      6978 ± 24%     -29.6%       4913 ± 34%  interrupts.CPU138.PMI:Performance_monitoring_interrupts
      6946 ± 25%     -43.9%       3898        interrupts.CPU142.NMI:Non-maskable_interrupts
      6946 ± 25%     -43.9%       3898        interrupts.CPU142.PMI:Performance_monitoring_interrupts
      7284 ± 12%     -24.8%       5474 ± 25%  interrupts.CPU21.NMI:Non-maskable_interrupts
      7284 ± 12%     -24.8%       5474 ± 25%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
    836.25 ± 28%     -36.9%     528.00 ± 45%  interrupts.CPU29.CAL:Function_call_interrupts
      5876 ± 33%     -18.6%       4785 ± 34%  interrupts.CPU29.NMI:Non-maskable_interrupts
      5876 ± 33%     -18.6%       4785 ± 34%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
      6560 ± 24%     -27.1%       4783 ± 34%  interrupts.CPU33.NMI:Non-maskable_interrupts
      6560 ± 24%     -27.1%       4783 ± 34%  interrupts.CPU33.PMI:Performance_monitoring_interrupts
      6840 ± 24%     -39.2%       4158 ± 14%  interrupts.CPU35.NMI:Non-maskable_interrupts
      6840 ± 24%     -39.2%       4158 ± 14%  interrupts.CPU35.PMI:Performance_monitoring_interrupts
    309.50 ±  2%     +24.2%     384.50 ± 12%  interrupts.CPU37.RES:Rescheduling_interrupts
    331.00 ±  5%     +38.7%     459.00 ± 28%  interrupts.CPU38.RES:Rescheduling_interrupts
      5946 ± 32%     -17.8%       4890 ± 34%  interrupts.CPU41.NMI:Non-maskable_interrupts
      5946 ± 32%     -17.8%       4890 ± 34%  interrupts.CPU41.PMI:Performance_monitoring_interrupts
      1730 ± 11%     -27.1%       1261 ± 25%  interrupts.CPU54.CAL:Function_call_interrupts
    523.75 ±  9%     -23.2%     402.25 ± 13%  interrupts.CPU54.RES:Rescheduling_interrupts
      4207 ±  9%     +86.7%       7854        interrupts.CPU56.NMI:Non-maskable_interrupts
      4207 ±  9%     +86.7%       7854        interrupts.CPU56.PMI:Performance_monitoring_interrupts
    305.75           +91.4%     585.25 ± 64%  interrupts.CPU57.RES:Rescheduling_interrupts
      4590 ± 22%     +70.9%       7844        interrupts.CPU59.NMI:Non-maskable_interrupts
      4590 ± 22%     +70.9%       7844        interrupts.CPU59.PMI:Performance_monitoring_interrupts
      6013 ± 30%     -20.3%       4793 ± 34%  interrupts.CPU7.NMI:Non-maskable_interrupts
      6013 ± 30%     -20.3%       4793 ± 34%  interrupts.CPU7.PMI:Performance_monitoring_interrupts
    437.00 ±  7%     -14.4%     374.25 ±  3%  interrupts.CPU73.RES:Rescheduling_interrupts
      5893 ± 33%     -22.0%       4596 ± 28%  interrupts.CPU80.NMI:Non-maskable_interrupts
      5893 ± 33%     -22.0%       4596 ± 28%  interrupts.CPU80.PMI:Performance_monitoring_interrupts
    853.75 ± 23%     -38.2%     527.75 ± 46%  interrupts.CPU98.CAL:Function_call_interrupts
     26.40            -1.0       25.38        perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.18            -0.7       11.44        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__poll
      7.20 ±  2%      -0.5        6.69 ±  2%  perf-profile.calltrace.cycles-pp.testcase
      5.35            -0.3        5.05 ±  2%  perf-profile.calltrace.cycles-pp.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      4.72            -0.3        4.45 ±  2%  perf-profile.calltrace.cycles-pp.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      5.00            -0.2        4.78        perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.55            -0.2        0.39 ± 57%  perf-profile.calltrace.cycles-pp.ring_buffer_unlock_commit.trace_buffer_unlock_commit_regs.ftrace_syscall_exit.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      1.92            -0.1        1.78        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.31 ±  2%      -0.1        2.19 ±  3%  perf-profile.calltrace.cycles-pp.trace_buffer_lock_reserve.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.71 ±  4%      -0.1        0.60 ±  5%  perf-profile.calltrace.cycles-pp.__virt_addr_valid.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.81 ±  2%      -0.1        0.72 ±  2%  perf-profile.calltrace.cycles-pp.trace_buffer_unlock_commit_regs.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.60 ±  2%      -0.1        0.53 ±  2%  perf-profile.calltrace.cycles-pp.ring_buffer_unlock_commit.trace_buffer_unlock_commit_regs.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64
      1.13 ±  3%      -0.0        1.09        perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     65.27            +0.4       65.69        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     92.14            +0.6       92.70        perf-profile.calltrace.cycles-pp.__poll
     85.67            +0.6       86.29        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     73.01            +1.4       74.39        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     67.10            +1.7       68.77        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      0.00            +6.2        6.22        perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     26.26            -1.4       24.82        perf-profile.children.cycles-pp.__fget_light
     12.24            -0.8       11.49        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      7.27 ±  2%      -0.5        6.76 ±  2%  perf-profile.children.cycles-pp.testcase
      5.37            -0.3        5.07 ±  2%  perf-profile.children.cycles-pp.syscall_trace_enter
      4.78            -0.3        4.50 ±  2%  perf-profile.children.cycles-pp.ftrace_syscall_enter
      4.32            -0.2        4.07        perf-profile.children.cycles-pp.__fdget
      2.03            -0.1        1.89        perf-profile.children.cycles-pp.__check_object_size
      1.59            -0.1        1.48 ±  2%  perf-profile.children.cycles-pp.trace_buffer_unlock_commit_regs
      1.17            -0.1        1.05 ±  2%  perf-profile.children.cycles-pp.ring_buffer_unlock_commit
      0.71 ±  4%      -0.1        0.60 ±  5%  perf-profile.children.cycles-pp.__virt_addr_valid
      0.68 ±  2%      -0.1        0.60        perf-profile.children.cycles-pp.rb_commit
      1.54            -0.1        1.48 ±  2%  perf-profile.children.cycles-pp.__kmalloc
      0.56 ±  2%      -0.1        0.51 ±  4%  perf-profile.children.cycles-pp.memcpy_erms
      0.30 ±  5%      -0.0        0.27 ±  4%  perf-profile.children.cycles-pp.ring_buffer_event_data
      0.10 ±  4%      -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.33            -0.0        0.30        perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.should_failslab
      0.28            +0.0        0.31 ±  2%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.18 ±  2%      +0.0        0.22 ±  3%  perf-profile.children.cycles-pp.poll_freewait
     92.75            +0.5       93.27        perf-profile.children.cycles-pp.__poll
     85.74            +0.6       86.34        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     73.09            +1.4       74.45        perf-profile.children.cycles-pp.do_syscall_64
     67.14            +1.7       68.81        perf-profile.children.cycles-pp.__x64_sys_poll
     66.28            +1.7       68.01        perf-profile.children.cycles-pp.do_sys_poll
      0.00            +5.3        5.34        perf-profile.children.cycles-pp.__put_user_nocheck_2
     24.35            -1.3       23.02        perf-profile.self.cycles-pp.__fget_light
      8.21            -0.7        7.56        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      7.08 ±  3%      -0.5        6.57 ±  2%  perf-profile.self.cycles-pp.testcase
      1.87            -0.2        1.71        perf-profile.self.cycles-pp.__fdget
      0.69 ±  5%      -0.1        0.58 ±  6%  perf-profile.self.cycles-pp.__virt_addr_valid
      0.67 ±  2%      -0.1        0.59 ±  2%  perf-profile.self.cycles-pp.rb_commit
      0.69            -0.1        0.63        perf-profile.self.cycles-pp.__x64_sys_poll
      0.54 ±  3%      -0.1        0.49 ±  3%  perf-profile.self.cycles-pp.memcpy_erms
      0.47            -0.0        0.43 ±  4%  perf-profile.self.cycles-pp.ring_buffer_unlock_commit
      0.65            -0.0        0.60        perf-profile.self.cycles-pp.ftrace_syscall_exit
      0.81            -0.0        0.77 ±  2%  perf-profile.self.cycles-pp.__kmalloc
      0.24 ±  3%      -0.0        0.21 ±  5%  perf-profile.self.cycles-pp.ring_buffer_event_data
      0.27            -0.0        0.25        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.22            -0.0        0.20 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      0.24            +0.0        0.27 ±  4%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.16 ±  2%      +0.0        0.19 ±  4%  perf-profile.self.cycles-pp.poll_freewait
      0.00            +3.6        3.64        perf-profile.self.cycles-pp.__put_user_nocheck_2
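
The perf-stat.overall.path-length rows can be cross-checked against other rows of the same table. The sketch below assumes that path-length means total instructions divided by will-it-scale.workload. The report does not give that definition, and the function names are hypothetical.

```python
def path_length(total_instructions, workload_ops):
    """Instructions retired per benchmark operation (assumed definition)."""
    return total_instructions / workload_ops


def rel_err(estimate, reported):
    """Relative error of an estimate against the value printed in the table."""
    return abs(estimate - reported) / reported


# lkp-hsw-4ex1 values copied from the table. The instruction totals are
# printed with only three or four significant digits, so the results are
# approximate.
base = path_length(1.495e14, 42577786)  # commit ea6f043fc9
new = path_length(1.48e14, 39477406)    # commit d55564cfc2

reported_base, reported_new = 3511981, 3750127

print(f"base ~ {base:.0f}, new ~ {new:.0f}")
print(f"relative error: {rel_err(base, reported_base):.4f}, "
      f"{rel_err(new, reported_new):.4f}")
```

If the assumed definition holds, both estimates land well within 1% of the reported values. Total instructions fall by about 1.0% while completed operations fall by 7.3%, so each operation costs roughly 6.8% more instructions after the commit.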





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.9.0-00857-gd55564cfc22232" of type "text/plain" (169998 bytes)

View attachment "job-script" of type "text/plain" (7803 bytes)

View attachment "job.yaml" of type "text/plain" (5430 bytes)

View attachment "reproduce" of type "text/plain" (336 bytes)
