Message-ID: <20190718095037.GC27250@shao2-debian>
Date:   Thu, 18 Jul 2019 17:50:37 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Nadav Amit <namit@...are.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...capital.net>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>, Jessica Yu <jeyu@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [x86/modules]  f2c65fb322:  will-it-scale.per_process_ops -2.9%
 regression

Greetings,

FYI, we noticed a -2.9% regression of will-it-scale.per_process_ops due to commit:


commit: f2c65fb3221adc6b73b0549fc7ba892022db9797 ("x86/modules: Avoid breaking W^X while loading modules")
https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with following parameters:

	nr_task: 100%
	mode: process
	test: poll1
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 to n parallel copies to see whether the testcase scales. It builds both process- and thread-based variants of each test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/poll1/will-it-scale

commit: 
  7298e24f90 ("x86/kprobes: Set instruction page as executable")
  f2c65fb322 ("x86/modules: Avoid breaking W^X while loading modules")

7298e24f904224fa f2c65fb3221adc6b73b0549fc7b 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change         %stddev
             \          |                \  
   5017865            -2.9%    4874578        will-it-scale.per_process_ops
 9.634e+08            -2.9%  9.359e+08        will-it-scale.workload
     33.31            -1.0%      32.99        boot-time.dhcp
  10541056 ±  7%     +12.1%   11820032        meminfo.DirectMap2M
    652.58 ±  7%     +10.3%     719.62 ±  8%  sched_debug.cfs_rq:/.util_avg.min
     19388 ± 14%     -17.5%      16003 ±  6%  softirqs.CPU134.RCU
   1140906            -2.7%    1109655 ±  2%  proc-vmstat.pgalloc_normal
   1069002 ±  2%      -4.7%    1018671 ±  3%  proc-vmstat.pgfree
      2124 ± 11%     -21.8%       1662 ±  4%  slabinfo.pool_workqueue.active_objs
      2220 ± 11%     -21.6%       1742 ±  5%  slabinfo.pool_workqueue.num_objs
      8708 ±  3%     +11.6%       9714 ±  6%  numa-vmstat.node0.nr_kernel_stack
      6771 ±  7%     -12.5%       5924 ±  2%  numa-vmstat.node3.nr_kernel_stack
      5180 ± 19%     -31.7%       3540 ± 13%  numa-vmstat.node3.nr_slab_reclaimable
      8710 ±  3%     +11.5%       9714 ±  6%  numa-meminfo.node0.KernelStack
     20724 ± 19%     -31.7%      14162 ± 13%  numa-meminfo.node3.KReclaimable
      6773 ±  7%     -12.5%       5923 ±  2%  numa-meminfo.node3.KernelStack
     20724 ± 19%     -31.7%      14162 ± 13%  numa-meminfo.node3.SReclaimable
     82391 ±  9%     -12.4%      72177 ±  8%  numa-meminfo.node3.Slab
    171.00 ±103%     -87.6%      21.25 ± 70%  interrupts.CPU110.RES:Rescheduling_interrupts
      8457           -37.6%       5278 ± 34%  interrupts.CPU145.NMI:Non-maskable_interrupts
      8457           -37.6%       5278 ± 34%  interrupts.CPU145.PMI:Performance_monitoring_interrupts
      8453           -37.5%       5285 ± 34%  interrupts.CPU146.NMI:Non-maskable_interrupts
      8453           -37.5%       5285 ± 34%  interrupts.CPU146.PMI:Performance_monitoring_interrupts
      8458           -37.5%       5284 ± 34%  interrupts.CPU147.NMI:Non-maskable_interrupts
      8458           -37.5%       5284 ± 34%  interrupts.CPU147.PMI:Performance_monitoring_interrupts
      8456           -37.6%       5279 ± 34%  interrupts.CPU149.NMI:Non-maskable_interrupts
      8456           -37.6%       5279 ± 34%  interrupts.CPU149.PMI:Performance_monitoring_interrupts
      8449           -37.5%       5278 ± 34%  interrupts.CPU151.NMI:Non-maskable_interrupts
      8449           -37.5%       5278 ± 34%  interrupts.CPU151.PMI:Performance_monitoring_interrupts
     14.75 ± 41%   +2237.3%     344.75 ± 93%  interrupts.CPU154.RES:Rescheduling_interrupts
    108.75 ±112%     -85.7%      15.50 ± 41%  interrupts.CPU168.RES:Rescheduling_interrupts
    168.00 ± 19%     -41.8%      97.75 ± 40%  interrupts.CPU18.RES:Rescheduling_interrupts
      1882 ± 87%     -94.9%      95.50 ± 50%  interrupts.CPU19.RES:Rescheduling_interrupts
    218.75 ± 88%    +196.9%     649.50 ± 69%  interrupts.CPU22.RES:Rescheduling_interrupts
      8448           -37.4%       5291 ± 34%  interrupts.CPU31.NMI:Non-maskable_interrupts
      8448           -37.4%       5291 ± 34%  interrupts.CPU31.PMI:Performance_monitoring_interrupts
     79.50 ± 23%    +537.4%     506.75 ± 75%  interrupts.CPU33.RES:Rescheduling_interrupts
      8436           -37.5%       5274 ± 34%  interrupts.CPU34.NMI:Non-maskable_interrupts
      8436           -37.5%       5274 ± 34%  interrupts.CPU34.PMI:Performance_monitoring_interrupts
    195.50 ±127%    +423.1%       1022 ± 47%  interrupts.CPU36.RES:Rescheduling_interrupts
     43.00 ± 46%   +3592.4%       1587 ±150%  interrupts.CPU37.RES:Rescheduling_interrupts
     82.00 ± 28%    +103.4%     166.75 ± 52%  interrupts.CPU41.RES:Rescheduling_interrupts
    102.75 ± 29%     -53.3%      48.00 ± 45%  interrupts.CPU50.RES:Rescheduling_interrupts
    474.00 ± 30%     -73.0%     127.75 ± 75%  interrupts.CPU68.RES:Rescheduling_interrupts
      8042 ±  4%      +6.5%       8565 ±  4%  interrupts.CPU95.RES:Rescheduling_interrupts
      0.02            -6.0%       0.02 ±  5%  perf-stat.i.MPKI
 1.111e+11            -2.8%  1.079e+11        perf-stat.i.branch-instructions
 2.454e+08            -3.4%  2.371e+08        perf-stat.i.branch-misses
    726252 ±  3%      -7.8%     669916 ±  3%  perf-stat.i.cache-misses
  13163065            -8.5%   12044270 ±  5%  perf-stat.i.cache-references
      1.02            +3.0%       1.05        perf-stat.i.cpi
    835228            +7.0%     893984 ±  4%  perf-stat.i.cycles-between-cache-misses
 1.419e+11            -2.8%   1.38e+11        perf-stat.i.dTLB-loads
      0.00            -0.0        0.00 ±  3%  perf-stat.i.dTLB-store-miss-rate%
     13827            -8.8%      12606 ±  3%  perf-stat.i.dTLB-store-misses
 1.023e+11            -2.8%  9.937e+10        perf-stat.i.dTLB-stores
 2.909e+08           -12.0%   2.56e+08 ±  2%  perf-stat.i.iTLB-load-misses
 5.448e+11            -2.8%  5.293e+11        perf-stat.i.instructions
      1876           +10.4%       2071 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      0.98            -2.9%       0.95        perf-stat.i.ipc
      3297            -1.7%       3241        perf-stat.i.minor-faults
     31602 ± 11%     -21.5%      24805 ± 15%  perf-stat.i.node-loads
      3297            -1.7%       3241        perf-stat.i.page-faults
      0.02            -5.8%       0.02 ±  5%  perf-stat.overall.MPKI
      1.02            +3.0%       1.05        perf-stat.overall.cpi
    766162 ±  3%      +8.5%     831132 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.00            -0.0        0.00 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
      1872           +10.4%       2068 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.98            -2.9%       0.95        perf-stat.overall.ipc
     85.08            +3.2       88.25 ±  2%  perf-stat.overall.node-load-miss-rate%
 1.107e+11            -2.8%  1.075e+11        perf-stat.ps.branch-instructions
 2.446e+08            -3.4%  2.363e+08        perf-stat.ps.branch-misses
    723856 ±  3%      -7.8%     667682 ±  3%  perf-stat.ps.cache-misses
  13118781            -8.5%   12003520 ±  5%  perf-stat.ps.cache-references
 1.414e+11            -2.8%  1.375e+11        perf-stat.ps.dTLB-loads
     13781            -8.8%      12564 ±  3%  perf-stat.ps.dTLB-store-misses
 1.019e+11            -2.8%  9.903e+10        perf-stat.ps.dTLB-stores
 2.899e+08           -12.0%  2.551e+08 ±  2%  perf-stat.ps.iTLB-load-misses
 5.429e+11            -2.8%  5.275e+11        perf-stat.ps.instructions
      3286            -1.7%       3230        perf-stat.ps.minor-faults
     31502 ± 11%     -21.5%      24724 ± 15%  perf-stat.ps.node-loads
      3286            -1.7%       3230        perf-stat.ps.page-faults
 1.623e+14            -3.3%   1.57e+14        perf-stat.total.instructions
     25.77            -0.7       25.07        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.poll
      3.34            -0.3        3.08        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
     18.82            -0.2       18.60        perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     98.24            -0.2       98.08        perf-profile.calltrace.cycles-pp.poll
      4.33            -0.2        4.17        perf-profile.calltrace.cycles-pp.copy_user_generic_unrolled._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      4.96            -0.1        4.86        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.poll
      3.32            -0.1        3.21        perf-profile.calltrace.cycles-pp.___might_sleep.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll
      1.63            +0.2        1.79        perf-profile.calltrace.cycles-pp.testcase
      6.54            +0.2        6.72        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.85            +0.2        2.05        perf-profile.calltrace.cycles-pp.__might_sleep.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll
      1.41            +0.3        1.74 ±  2%  perf-profile.calltrace.cycles-pp.check_stack_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      3.44            +0.5        3.91        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.12            +0.6        1.68        perf-profile.calltrace.cycles-pp.poll_freewait.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     64.34            +0.7       65.03        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.poll
     60.51            +0.7       61.25        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
     49.87            +0.8       50.68        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
     56.87            +1.0       57.84        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
     17.21            -0.4       16.78        perf-profile.children.cycles-pp.entry_SYSCALL_64
     13.72            -0.4       13.32        perf-profile.children.cycles-pp.syscall_return_via_sysret
      3.67            -0.3        3.38        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      5.37            -0.3        5.10        perf-profile.children.cycles-pp.copy_user_generic_unrolled
     18.32            -0.1       18.20        perf-profile.children.cycles-pp._copy_from_user
      3.53            -0.1        3.42        perf-profile.children.cycles-pp.___might_sleep
     98.40            -0.1       98.31        perf-profile.children.cycles-pp.poll
      1.82            +0.1        1.96        perf-profile.children.cycles-pp.testcase
      6.96            +0.2        7.15        perf-profile.children.cycles-pp.__might_fault
      1.96            +0.2        2.17        perf-profile.children.cycles-pp.__might_sleep
      1.55            +0.3        1.82        perf-profile.children.cycles-pp.check_stack_object
      3.79            +0.6        4.40        perf-profile.children.cycles-pp.__check_object_size
      1.13            +0.6        1.76        perf-profile.children.cycles-pp.poll_freewait
     64.46            +0.7       65.13        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     60.89            +0.7       61.62        perf-profile.children.cycles-pp.do_syscall_64
     56.65            +0.9       57.52        perf-profile.children.cycles-pp.__x64_sys_poll
     51.27            +1.1       52.38        perf-profile.children.cycles-pp.do_sys_poll
     13.70            -0.4       13.30        perf-profile.self.cycles-pp.syscall_return_via_sysret
      4.78            -0.3        4.45        perf-profile.self.cycles-pp.__x64_sys_poll
      3.40            -0.3        3.15        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      5.01            -0.2        4.78        perf-profile.self.cycles-pp.copy_user_generic_unrolled
     11.45            -0.2       11.25        perf-profile.self.cycles-pp.poll
      8.55            -0.2        8.39        perf-profile.self.cycles-pp.entry_SYSCALL_64
      3.96            -0.1        3.85        perf-profile.self.cycles-pp.do_syscall_64
      3.40            -0.1        3.28        perf-profile.self.cycles-pp.___might_sleep
      3.53            -0.1        3.47        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.73            -0.0        0.71        perf-profile.self.cycles-pp.poll_select_set_timeout
      2.78            +0.1        2.84        perf-profile.self.cycles-pp._copy_from_user
      1.58            +0.1        1.66        perf-profile.self.cycles-pp.__might_fault
      1.56            +0.1        1.65        perf-profile.self.cycles-pp.testcase
      1.41            +0.2        1.57        perf-profile.self.cycles-pp.check_stack_object
      1.73            +0.2        1.94        perf-profile.self.cycles-pp.__might_sleep
      2.30            +0.3        2.61        perf-profile.self.cycles-pp.__check_object_size
     23.36            +0.3       23.70        perf-profile.self.cycles-pp.do_sys_poll
      0.92            +0.6        1.53        perf-profile.self.cycles-pp.poll_freewait


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  6e+06 +-+-----------------------------------------------------------------+   
        |                                                                   |   
  5e+06 +-++...+         +..+...+..+...+..+..+...+..+...+..+..+...+..+...+..|   
        O  O   O         O  O   O  O   O  O  O   O  O   O  O  O   O  O   O  O   
        |       :        :                                                  |   
  4e+06 +-+     :       :                                                   |   
        |       :       :                                                   |   
  3e+06 +-+     :       :                                                   |   
        |        :      :                                                   |   
  2e+06 +-+      :     :                                                    |   
        |        :     :                                                    |   
        |        :     :                                                    |   
  1e+06 +-+       :    :                                                    |   
        |         :   :                                                     |   
      0 +-+-------O---O-----------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               will-it-scale.workload                           
                                                                                
  1e+09 +-+-----------------------------------------------------------------+   
  9e+08 O-+O...O         O..O.  O  O   O..O..O...O..O...O..O..O...O. O   O..O   
        |      :         :                                                  |   
  8e+08 +-+    :         :                                                  |   
  7e+08 +-+     :       :                                                   |   
        |       :       :                                                   |   
  6e+08 +-+     :       :                                                   |   
  5e+08 +-+     :       :                                                   |   
  4e+08 +-+      :     :                                                    |   
        |        :     :                                                    |   
  3e+08 +-+      :     :                                                    |   
  2e+08 +-+      :     :                                                    |   
        |         :   :                                                     |   
  1e+08 +-+       :   :                                                     |   
      0 +-+-------O---O-----------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.1.0-rc7-00027-gf2c65fb3221adc" of type "text/plain" (189140 bytes)

View attachment "job-script" of type "text/plain" (7329 bytes)

View attachment "job.yaml" of type "text/plain" (4893 bytes)

View attachment "reproduce" of type "text/plain" (310 bytes)
