Message-ID: <20200708082849.GM3874@shao2-debian>
Date:   Wed, 8 Jul 2020 16:28:49 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Alexandre Chartre <alexandre.chartre@...cle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [x86/entry/common] 8f159f1dfa: will-it-scale.per_process_ops -6.4% regression

Greetings,

FYI, we noticed a -6.4% regression in will-it-scale.per_process_ops due to commit:


commit: 8f159f1dfa1ea29d70a84335fe6a8bd501a9eecd ("x86/entry/common: Protect against instrumentation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 192 threads Cooper Lake with 128G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: lseek1
	cpufreq_governor: performance
	ucode: 0x86000017

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
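
For readers unfamiliar with the benchmark, the following is a minimal Python sketch of the kind of loop a single lseek test process runs: seek to offset 0 on a temporary file over and over, and count the completed calls. It is an illustration only, not the will-it-scale source. The function name `lseek_loop` and the `duration_s` parameter are hypothetical, and interpreter overhead dominates here, so the counts are not comparable to the figures in this report.

```python
import os
import tempfile
import time


def lseek_loop(duration_s: float = 0.1) -> int:
    """Seek to offset 0 on a temp file repeatedly; return completed calls.

    Hypothetical helper for illustration; not part of will-it-scale.
    """
    ops = 0
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            # Each call enters and leaves the kernel once, which is why
            # syscall entry/exit cost dominates this kind of workload.
            os.lseek(fd, 0, os.SEEK_SET)
            ops += 1
    return ops


if __name__ == "__main__":
    print("lseek calls completed:", lseek_loop(0.1))
```

Because each iteration does almost no work inside the kernel, any cost added to the syscall entry and exit path shows up directly in the operation count.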



If you fix the issue, please add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-cpx-4s1/lseek1/will-it-scale/0x86000017

commit: 
  1723be30e4 ("x86/entry: Mark enter_from_user_mode() noinstr")
  8f159f1dfa ("x86/entry/common: Protect against instrumentation")

1723be30e46fbda0 8f159f1dfa1ea29d70a84335fe6 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   9977836            -6.4%    9334708        will-it-scale.per_process_ops
 1.916e+09            -6.4%  1.792e+09        will-it-scale.workload
      1612           -75.1%     401.75 ±173%  meminfo.Mlocked
     38.00            +2.6%      39.00        vmstat.cpu.us
     30383 ± 27%     +51.8%      46125 ± 12%  numa-meminfo.node1.KReclaimable
     30383 ± 27%     +51.8%      46125 ± 12%  numa-meminfo.node1.SReclaimable
     26574 ± 23%     -29.2%      18820 ± 14%  numa-meminfo.node2.KReclaimable
     26574 ± 23%     -29.2%      18820 ± 14%  numa-meminfo.node2.SReclaimable
     82840 ± 12%     -14.1%      71200 ±  4%  numa-meminfo.node2.SUnreclaim
    100.00 ± 26%     -79.0%      21.00 ±173%  numa-vmstat.node1.nr_mlock
      7595 ± 27%     +51.8%      11531 ± 12%  numa-vmstat.node1.nr_slab_reclaimable
    115.50 ± 26%     -81.8%      21.00 ±173%  numa-vmstat.node2.nr_mlock
      6643 ± 23%     -29.2%       4704 ± 14%  numa-vmstat.node2.nr_slab_reclaimable
     20710 ± 12%     -14.1%      17799 ±  4%  numa-vmstat.node2.nr_slab_unreclaimable
     15093 ±  4%     +14.5%      17281 ±  4%  sched_debug.cpu.sched_count.max
      7918 ± 11%     +14.3%       9046 ±  4%  sched_debug.cpu.ttwu_count.max
      0.49 ± 56%     -96.8%       0.02 ±160%  sched_debug.rt_rq:/.rt_time.avg
     93.99 ± 56%     -96.8%       3.00 ±160%  sched_debug.rt_rq:/.rt_time.max
      6.77 ± 56%     -96.8%       0.22 ±160%  sched_debug.rt_rq:/.rt_time.stddev
    296.75 ± 23%    +263.2%       1077 ± 67%  interrupts.32:PCI-MSI.524290-edge.eth0-TxRx-1
    296.75 ± 23%    +263.2%       1077 ± 67%  interrupts.CPU10.32:PCI-MSI.524290-edge.eth0-TxRx-1
    899.00 ±  7%     -10.5%     805.00        interrupts.CPU141.CAL:Function_call_interrupts
      1204 ± 36%     +83.6%       2211 ± 38%  interrupts.CPU170.CAL:Function_call_interrupts
      1324 ± 28%     -30.8%     916.00 ± 23%  interrupts.CPU2.CAL:Function_call_interrupts
      3042 ± 36%     -53.3%       1419 ± 27%  interrupts.CPU24.CAL:Function_call_interrupts
      1061 ± 24%     +83.7%       1950 ± 32%  interrupts.CPU72.CAL:Function_call_interrupts
     77.25 ±165%     -97.1%       2.25 ± 19%  interrupts.CPU93.TLB:TLB_shootdowns
    769.00 ± 23%     -36.9%     485.00 ± 11%  interrupts.TLB:TLB_shootdowns
     21833 ±  3%     +18.8%      25926 ±  7%  softirqs.CPU0.RCU
     20599 ±  4%     +13.5%      23371 ±  8%  softirqs.CPU107.RCU
     22896 ± 11%     +21.8%      27893 ±  5%  softirqs.CPU125.RCU
     21380 ±  6%     +18.5%      25341 ±  7%  softirqs.CPU163.RCU
     21890 ±  9%     +15.1%      25191 ±  6%  softirqs.CPU166.RCU
     20047 ±  5%     +17.0%      23453 ±  8%  softirqs.CPU176.RCU
     21786 ±  3%     +16.2%      25318 ±  8%  softirqs.CPU25.RCU
     23213 ±  4%     +14.6%      26602 ±  6%  softirqs.CPU35.RCU
     21272 ±  5%     +17.4%      24975 ±  8%  softirqs.CPU71.RCU
     20159 ±  4%     +16.1%      23400 ±  7%  softirqs.CPU76.RCU
 1.176e+11            +2.7%  1.208e+11        perf-stat.i.branch-instructions
      1.65            -0.1        1.51        perf-stat.i.branch-miss-rate%
 1.934e+09            -6.5%  1.808e+09        perf-stat.i.branch-misses
      1.26            -5.7%       1.19        perf-stat.i.cpi
      0.00 ±  5%      +0.0        0.00 ±  5%  perf-stat.i.dTLB-load-miss-rate%
    441221 ±  6%    +606.9%    3119036 ±  5%  perf-stat.i.dTLB-load-misses
   1.7e+11            +7.2%  1.823e+11        perf-stat.i.dTLB-loads
     16104 ±  2%      -4.2%      15432 ±  3%  perf-stat.i.dTLB-store-misses
 9.743e+10           +17.4%  1.144e+11        perf-stat.i.dTLB-stores
 2.243e+09           -24.3%  1.697e+09 ±  2%  perf-stat.i.iTLB-load-misses
  46888822            +9.4%   51286197        perf-stat.i.iTLB-loads
 5.555e+11            +5.5%  5.861e+11        perf-stat.i.instructions
    257.71           +37.7%     354.92        perf-stat.i.instructions-per-iTLB-miss
      0.80            +6.0%       0.84        perf-stat.i.ipc
      1.04            -5.9%       0.98 ±  3%  perf-stat.i.metric.K/sec
      2005            +8.4%       2174        perf-stat.i.metric.M/sec
      0.03            -5.8%       0.03 ±  2%  perf-stat.overall.MPKI
      1.64            -0.1        1.50        perf-stat.overall.branch-miss-rate%
      1.26            -5.7%       1.18        perf-stat.overall.cpi
      0.00 ±  8%      +0.0        0.00 ±  6%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  2%      -0.0        0.00 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
    247.68           +39.5%     345.39 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.80            +6.0%       0.84        perf-stat.overall.ipc
     87575           +12.6%      98570        perf-stat.overall.path-length
 1.172e+11            +2.7%  1.204e+11        perf-stat.ps.branch-instructions
 1.927e+09            -6.5%  1.802e+09        perf-stat.ps.branch-misses
    473764 ±  8%    +561.3%    3132830 ±  6%  perf-stat.ps.dTLB-load-misses
 1.694e+11            +7.2%  1.817e+11        perf-stat.ps.dTLB-loads
  9.71e+10           +17.4%   1.14e+11        perf-stat.ps.dTLB-stores
 2.235e+09           -24.3%  1.692e+09 ±  2%  perf-stat.ps.iTLB-load-misses
  46727836            +9.5%   51154009        perf-stat.ps.iTLB-loads
 5.537e+11            +5.5%  5.841e+11        perf-stat.ps.instructions
 1.678e+14            +5.3%  1.767e+14        perf-stat.total.instructions
     39.88            -3.8       36.04        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
     40.97            -2.5       38.46        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.lseek64
     22.10            -1.1       20.95        perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      8.75            -0.5        8.27        perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      7.52            -0.5        7.07        perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.44            -0.4        6.05        perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      6.69            -0.2        6.54        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.lseek64
      1.98            -0.1        1.88        perf-profile.calltrace.cycles-pp.generic_file_llseek_size.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
     98.92            +0.2       99.08        perf-profile.calltrace.cycles-pp.lseek64
      2.30            +0.6        2.94        perf-profile.calltrace.cycles-pp.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      0.00            +0.8        0.75        perf-profile.calltrace.cycles-pp.enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      0.00            +1.7        1.69        perf-profile.calltrace.cycles-pp.fpregs_assert_state_consistent.__prepare_exit_to_usermode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      0.00            +2.3        2.33        perf-profile.calltrace.cycles-pp.__syscall_return_slowpath.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
     47.50            +4.1       51.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.lseek64
      0.00            +4.9        4.85        perf-profile.calltrace.cycles-pp.__prepare_exit_to_usermode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
      0.00            +5.9        5.87        perf-profile.calltrace.cycles-pp.exit_to_user_mode.entry_SYSCALL_64_after_hwframe.lseek64
     40.57            -4.0       36.60        perf-profile.children.cycles-pp.do_syscall_64
     27.37            -1.6       25.76        perf-profile.children.cycles-pp.entry_SYSCALL_64
     22.66            -1.1       21.53        perf-profile.children.cycles-pp.ksys_lseek
     19.98            -0.9       19.10        perf-profile.children.cycles-pp.syscall_return_via_sysret
      9.42            -0.5        8.92        perf-profile.children.cycles-pp.__fdget_pos
      7.52            -0.5        7.07        perf-profile.children.cycles-pp.__fget_light
      6.46            -0.4        6.07        perf-profile.children.cycles-pp.shmem_file_llseek
      2.27            -0.1        2.17        perf-profile.children.cycles-pp.generic_file_llseek_size
      1.18            -0.1        1.11        perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.43 ±  2%      -0.0        0.41 ±  2%  perf-profile.children.cycles-pp.lseek@plt
      1.97            +0.1        2.02        perf-profile.children.cycles-pp.fpregs_assert_state_consistent
      2.43            +0.6        3.06        perf-profile.children.cycles-pp.__x64_sys_lseek
      0.00            +0.8        0.80        perf-profile.children.cycles-pp.enter_from_user_mode
      0.00            +2.3        2.33        perf-profile.children.cycles-pp.__syscall_return_slowpath
     47.97            +2.8       50.81        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00            +5.1        5.06        perf-profile.children.cycles-pp.__prepare_exit_to_usermode
      0.00            +7.2        7.18        perf-profile.children.cycles-pp.exit_to_user_mode
     13.07            -9.5        3.55        perf-profile.self.cycles-pp.do_syscall_64
     18.01            -1.2       16.84        perf-profile.self.cycles-pp.lseek64
     19.84            -0.9       18.93        perf-profile.self.cycles-pp.syscall_return_via_sysret
     13.61            -0.7       12.89        perf-profile.self.cycles-pp.entry_SYSCALL_64
      7.03            -0.4        6.59        perf-profile.self.cycles-pp.__fget_light
      6.10            -0.4        5.73        perf-profile.self.cycles-pp.shmem_file_llseek
      7.57            -0.3        7.25        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.24            -0.1        2.14        perf-profile.self.cycles-pp.generic_file_llseek_size
      4.57            -0.1        4.47        perf-profile.self.cycles-pp.ksys_lseek
      2.12            -0.1        2.05        perf-profile.self.cycles-pp.__fdget_pos
      0.62            -0.0        0.59        perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.43 ±  2%      -0.0        0.40 ±  3%  perf-profile.self.cycles-pp.lseek@plt
      2.11            +0.6        2.75        perf-profile.self.cycles-pp.__x64_sys_lseek
      0.00            +0.7        0.68        perf-profile.self.cycles-pp.enter_from_user_mode
      0.00            +2.3        2.28        perf-profile.self.cycles-pp.__syscall_return_slowpath
      0.00            +3.1        3.09        perf-profile.self.cycles-pp.__prepare_exit_to_usermode
      0.00            +7.1        7.08        perf-profile.self.cycles-pp.exit_to_user_mode
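
The %change column and some of the derived perf-stat rows can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes the relative change for two rows and approximates perf-stat.overall.path-length, assuming it is total instructions divided by will-it-scale.workload. That definition is an assumption inferred from the numbers above, not something stated in this report, and the helper name `pct_change` is hypothetical.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# Values copied from the comparison table in this report.
ops_change = pct_change(9977836, 9334708)    # will-it-scale.per_process_ops
dtlb_change = pct_change(441221, 3119036)    # perf-stat.i.dTLB-load-misses

# Assumed definition: path length = instructions executed per operation.
path_base = 1.678e14 / 1.916e9   # reported: 87575
path_new = 1.767e14 / 1.792e9    # reported: 98570
path_change = pct_change(path_base, path_new)

if __name__ == "__main__":
    print(f"per_process_ops change:   {ops_change:+.1f}%")
    print(f"dTLB-load-misses change:  {dtlb_change:+.1f}%")
    print(f"path length (base, new):  {path_base:.0f}, {path_new:.0f}")
    print(f"path length change:       {path_change:+.1f}%")
```

Under that assumption, the recomputed values land within rounding of the table: fewer operations completed while more instructions were executed, so the instruction count per lseek call rose by roughly 12.6%.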


                                                                                
                             will-it-scale.per_process_ops                      
                                                                                
    1e+07 +-----------------------------------------------------------------+   
          |  + .+.+     +.++.+.+.+.+.+     +  + .+.+.+  + .++     +.+       |   
  9.8e+06 |-+ +    +   +              :   +    +         +   +   +          |   
          |         +.+               : .+                    +.+           |   
          |                            +                                    |   
  9.6e+06 |-+                                                               |   
          |                                                                 |   
  9.4e+06 |-+                                                               |   
          |             O OO               O O     O O O O OO               |   
  9.2e+06 |-+                O O O O O                                      |   
          |   O                                                             |   
          | O   O O   O                        O O                          |   
    9e+06 |-+       O                                                       |   
          |                            O O                                  |   
  8.8e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  1.95e+09 +----------------------------------------------------------------+   
           |                                                                |   
           |.+. .+.+     .+.                 +. .+.+. .+. .+.+    +. .+.+.+.|   
   1.9e+09 |-+ +    +   +   +.+.+.+.+.+     +  +     +   +    +   : +       |   
           |         + +               +   +                   +.+          |   
           |          +                 +.+                                 |   
  1.85e+09 |-+                                                              |   
           |                                                                |   
   1.8e+09 |-+                                                              |   
           |            O O O               OO     O O O O O O              |   
           |                  O O O O O                                     |   
  1.75e+09 |-+ O O O                                                        |   
           | O       OO                        O O                          |   
           |                            O O                                 |   
   1.7e+09 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.7.0-14050-g8f159f1dfa1ea" of type "text/plain" (206218 bytes)

View attachment "job-script" of type "text/plain" (7721 bytes)

View attachment "job.yaml" of type "text/plain" (5323 bytes)

View attachment "reproduce" of type "text/plain" (339 bytes)
