lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Mon, 18 Feb 2019 15:54:42 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Wei Yang <richardw.yang@...ux.intel.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: [LKP] [driver core]  570d020012: will-it-scale.per_thread_ops -12.2%
 regression

Greeting,

FYI, we noticed a -12.2% regression of will-it-scale.per_thread_ops due to commit:


commit: 570d0200123fb4f809aa2f6226e93a458d664d70 ("driver core: move device->knode_class to device_private")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: will-it-scale
on test machine: 288 threads Knights Mill with 80G memory
with following parameters:

	nr_task: 100%
	mode: thread
	test: unlink2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -29.9% regression |
| test machine     | 288 threads Knights Mill with 80G memory                      |
| test parameters  | cpufreq_governor=performance                                  |
|                  | mode=thread                                                   |
|                  | nr_task=100%                                                  |
|                  | test=signal1                                                  |
+------------------+---------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -16.5% regression |
| test machine     | 288 threads Knights Mill with 80G memory                      |
| test parameters  | cpufreq_governor=performance                                  |
|                  | mode=thread                                                   |
|                  | nr_task=100%                                                  |
|                  | test=open1                                                    |
+------------------+---------------------------------------------------------------+


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-knm01/unlink2/will-it-scale

commit: 
  4bd4e92cfe ("sysfs: fix blank line coding style warning")
  570d020012 ("driver core: move device->knode_class to device_private")

4bd4e92cfe6d2af7 570d0200123fb4f809aa2f6226 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    937.00           -12.2%     823.00        will-it-scale.per_thread_ops
    906625            -1.7%     891438        will-it-scale.time.involuntary_context_switches
    269989           -12.2%     237096        will-it-scale.workload
  17104244           -12.3%   15001702        softirqs.RCU
      6886            -1.7%       6768        vmstat.system.cs
 2.088e+08 ± 17%     -56.9%   90059912 ± 57%  cpuidle.C1.time
    683051 ± 15%     -42.5%     392628 ± 40%  cpuidle.C1.usage
 4.519e+08 ±  6%     +27.8%  5.777e+08 ±  9%  cpuidle.C6.time
    648490 ± 14%     -42.7%     371724 ± 43%  turbostat.C1
      0.23 ± 18%      -0.1        0.10 ± 62%  turbostat.C1%
      0.42 ±  9%      +0.1        0.55 ± 11%  turbostat.C6%
      0.41 ±  5%     -18.1%       0.34 ± 16%  turbostat.CPU%c1
      0.30 ±  6%     +27.9%       0.39 ± 13%  turbostat.CPU%c6
      5.22 ± 13%      -1.9        3.35 ±  4%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.27 ± 13%      -1.9        3.40 ±  3%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_open
      4.71 ± 14%      -1.8        2.92 ±  4%  perf-profile.calltrace.cycles-pp.unlink
      4.58 ± 15%      -1.8        2.81 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
      4.56 ± 15%      -1.8        2.80 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      4.40 ± 15%      -1.8        2.64 ±  5%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      3.18 ± 23%      -1.6        1.53 ±  9%  perf-profile.calltrace.cycles-pp.shmem_mknod.path_openat.do_filp_open.do_sys_open.do_syscall_64
      1.70 ± 19%      -0.7        0.95 ± 14%  perf-profile.calltrace.cycles-pp.shmem_get_inode.shmem_mknod.path_openat.do_filp_open.do_sys_open
      1.47 ± 20%      -0.7        0.76 ± 18%  perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      1.43 ± 23%      -0.7        0.73 ± 18%  perf-profile.calltrace.cycles-pp.new_inode.shmem_get_inode.shmem_mknod.path_openat.do_filp_open
      1.57 ±  5%      -0.5        1.04 ± 57%  perf-profile.calltrace.cycles-pp.perf_mux_hrtimer_handler.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      0.58            -0.1        0.52 ±  2%  perf-profile.calltrace.cycles-pp.filename_parentat.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.31 ± 16%      +0.7        3.00 ±  9%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      2.04 ± 17%      +0.7        2.76 ±  7%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
      2.09 ± 17%      +0.7        2.82 ±  7%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
     44.55            +1.7       46.29        perf-profile.calltrace.cycles-pp.__GI___libc_close
     44.26            +1.8       46.07        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__GI___libc_close
     44.24            +1.8       46.05        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_close
     43.67            +1.8       45.49        perf-profile.calltrace.cycles-pp.__close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_close
     43.80            +1.9       45.67        perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_close
     43.67            +1.9       45.61        perf-profile.calltrace.cycles-pp.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_open
     43.11            +2.0       45.13        perf-profile.calltrace.cycles-pp._raw_spin_lock.__close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
     43.17            +2.0       45.22        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__close_fd.__x64_sys_close.do_syscall_64
     43.08            +2.1       45.17        perf-profile.calltrace.cycles-pp._raw_spin_lock.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     43.13            +2.1       45.26        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__alloc_fd.do_sys_open.do_syscall_64
     12.29           -10.0%      11.06 ±  2%  perf-stat.i.MPKI
 8.564e+09            -1.7%  8.421e+09        perf-stat.i.branch-instructions
      1.59            -0.1        1.46 ±  2%  perf-stat.i.branch-miss-rate%
 1.356e+08            -9.7%  1.225e+08        perf-stat.i.branch-misses
     16.95            +1.1       18.02        perf-stat.i.cache-miss-rate%
  74122780            -6.2%   69544118        perf-stat.i.cache-misses
 4.377e+08           -11.8%  3.862e+08        perf-stat.i.cache-references
      6523            -1.6%       6416        perf-stat.i.context-switches
     12.02            +1.6%      12.21        perf-stat.i.cpi
      5778            +6.2%       6134        perf-stat.i.cycles-between-cache-misses
      0.33            -0.0        0.29 ±  2%  perf-stat.i.iTLB-load-miss-rate%
 1.185e+08           -15.6%      1e+08 ±  2%  perf-stat.i.iTLB-load-misses
 3.555e+10            -2.0%  3.484e+10        perf-stat.i.iTLB-loads
 3.563e+10            -2.0%  3.492e+10        perf-stat.i.instructions
    300.74           +16.2%     349.51 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      0.08            -1.6%       0.08        perf-stat.i.ipc
     12.28           -10.0%      11.06 ±  2%  perf-stat.overall.MPKI
      1.58            -0.1        1.45 ±  2%  perf-stat.overall.branch-miss-rate%
     16.94            +1.1       18.01        perf-stat.overall.cache-miss-rate%
     12.02            +1.6%      12.21        perf-stat.overall.cpi
      5776            +6.2%       6132        perf-stat.overall.cycles-between-cache-misses
      0.33            -0.0        0.29 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
    300.68           +16.2%     349.34 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.08            -1.6%       0.08        perf-stat.overall.ipc
  39542686           +11.7%   44150058        perf-stat.overall.path-length
  8.53e+09            -1.7%  8.388e+09        perf-stat.ps.branch-instructions
 1.351e+08            -9.7%   1.22e+08        perf-stat.ps.branch-misses
  73835513            -6.2%   69273046        perf-stat.ps.cache-misses
  4.36e+08           -11.8%  3.847e+08        perf-stat.ps.cache-references
      6498            -1.7%       6390        perf-stat.ps.context-switches
  1.18e+08           -15.6%   99644821 ±  2%  perf-stat.ps.iTLB-load-misses
 3.541e+10            -2.0%   3.47e+10        perf-stat.ps.iTLB-loads
 3.549e+10            -2.0%  3.479e+10        perf-stat.ps.instructions
 1.067e+13            -1.9%  1.047e+13        perf-stat.total.instructions
    595740 ± 18%     -43.0%     339338 ± 28%  sched_debug.cfs_rq:/.MIN_vruntime.avg
   4833300 ±  8%     -31.2%    3327465 ± 24%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
     15804 ±  7%     -28.3%      11329 ± 22%  sched_debug.cfs_rq:/.load.avg
   1096777           -24.1%     832275 ± 27%  sched_debug.cfs_rq:/.load.max
    108529 ±  4%     -27.8%      78378 ± 26%  sched_debug.cfs_rq:/.load.stddev
     12.36 ±  3%     -11.3%      10.96        sched_debug.cfs_rq:/.load_avg.avg
    522.40 ±  8%     -32.1%     354.80 ± 23%  sched_debug.cfs_rq:/.load_avg.max
     35.08 ±  7%     -28.4%      25.13 ± 21%  sched_debug.cfs_rq:/.load_avg.stddev
    595740 ± 18%     -43.0%     339338 ± 28%  sched_debug.cfs_rq:/.max_vruntime.avg
   4833300 ±  8%     -31.2%    3327465 ± 24%  sched_debug.cfs_rq:/.max_vruntime.stddev
      0.12 ±  9%     -21.9%       0.10 ± 18%  sched_debug.cfs_rq:/.nr_running.stddev
     77.95 ±  6%     +37.6%     107.25 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.max
      7.10 ±  2%     +20.5%       8.55 ± 15%  sched_debug.cfs_rq:/.nr_spread_over.stddev
      4.39           -15.0%       3.73 ±  8%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    468.00           -35.1%     303.60 ± 30%  sched_debug.cfs_rq:/.runnable_load_avg.max
     28.56           -35.1%      18.54 ± 29%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     15797 ±  7%     -28.3%      11323 ± 22%  sched_debug.cfs_rq:/.runnable_weight.avg
   1096533           -24.1%     832050 ± 27%  sched_debug.cfs_rq:/.runnable_weight.max
    108522 ±  4%     -27.8%      78365 ± 26%  sched_debug.cfs_rq:/.runnable_weight.stddev
    772.59 ±  6%     -28.1%     555.30 ±  5%  sched_debug.cpu.clock.stddev
    772.58 ±  6%     -28.1%     555.30 ±  5%  sched_debug.cpu.clock_task.stddev
      4.43           -15.1%       3.76 ±  8%  sched_debug.cpu.cpu_load[0].avg
     28.58           -33.9%      18.88 ± 27%  sched_debug.cpu.cpu_load[0].stddev
      4.51 ±  3%     -16.6%       3.76 ±  8%  sched_debug.cpu.cpu_load[1].avg
     29.07 ±  6%     -35.8%      18.66 ± 28%  sched_debug.cpu.cpu_load[1].stddev
      4.45 ±  2%     -15.7%       3.75 ±  8%  sched_debug.cpu.cpu_load[2].avg
     28.36 ±  4%     -34.7%      18.53 ± 28%  sched_debug.cpu.cpu_load[2].stddev
      4.40           -15.0%       3.73 ±  8%  sched_debug.cpu.cpu_load[3].avg
     27.95 ±  3%     -34.1%      18.43 ± 28%  sched_debug.cpu.cpu_load[3].stddev
      4.34 ±  2%     -14.3%       3.72 ±  8%  sched_debug.cpu.cpu_load[4].avg
    456.00 ±  3%     -34.2%     300.10 ± 29%  sched_debug.cpu.cpu_load[4].max
     27.74 ±  2%     -33.6%      18.41 ± 28%  sched_debug.cpu.cpu_load[4].stddev
     15602 ±  7%     -20.4%      12420 ± 19%  sched_debug.cpu.load.avg
    106214 ±  3%     -19.2%      85804 ± 18%  sched_debug.cpu.load.stddev
      0.00 ±  4%     -24.5%       0.00 ±  3%  sched_debug.cpu.next_balance.stddev
      0.50 ± 26%     -54.8%       0.23 ± 79%  sched_debug.rt_rq:/.rt_time.avg
    143.90 ± 26%     -54.8%      65.01 ± 79%  sched_debug.rt_rq:/.rt_time.max
      8.46 ± 26%     -54.8%       3.82 ± 79%  sched_debug.rt_rq:/.rt_time.stddev


                                                                                
                           will-it-scale.per_thread_ops                         
                                                                                
  1000 +-+------------------------------------------------------------------+   
       |                                                                    |   
       |..  +..  .+    +..            .+.. .+          +                    |   
   950 +-+ +   +.  + ..    .+..+.. .+.    +  +   +.. ..                     |   
       |  +         +     +       +           + +   +                       |   
       |                                       +                            |   
   900 +-+                                                                  |   
       |                                    O    O                          |   
   850 +-+          O          O    O                          O            |   
       O                  O       O    O  O    O            O    O       O  |   
       |                    O                       O  O O          O  O    O   
   800 +-+O       O    O                                                    |   
       |    O                                                               |   
       |       O                                                            |   
   750 +-+------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-knm01: 288 threads Knights Mill with 80G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-knm01/signal1/will-it-scale

commit: 
  4bd4e92cfe ("sysfs: fix blank line coding style warning")
  570d020012 ("driver core: move device->knode_class to device_private")

4bd4e92cfe6d2af7 570d0200123fb4f809aa2f6226 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change         %stddev
             \          |                \  
    266.75 ±  2%     -29.9%     187.00        will-it-scale.per_thread_ops
    131.40 ±  6%     -29.3%      92.91 ±  3%  will-it-scale.time.user_time
     76931 ±  2%     -29.9%      53963        will-it-scale.workload
    788152 ±  8%     +17.8%     928405 ±  5%  cpuidle.C6.usage
     33255 ±  4%      -8.2%      30514 ±  2%  numa-meminfo.node1.Slab
  34856762           +14.5%   39908744        softirqs.TIMER
      0.00 ± 26%      +0.0        0.01 ± 67%  mpstat.cpu.soft%
      0.28 ±  3%      -0.0        0.23 ±  2%  mpstat.cpu.usr%
    981.00 ±  7%     -12.8%     855.00 ±  2%  slabinfo.skbuff_ext_cache.active_objs
    981.00 ±  7%     -12.8%     855.00 ±  2%  slabinfo.skbuff_ext_cache.num_objs
    714927 ± 11%     +21.5%     868290 ±  5%  turbostat.C6
      1.71 ±  5%     +39.9%       2.39        turbostat.RAMWatt
      1121 ±  2%     -23.8%     854.06 ±  4%  sched_debug.cfs_rq:/.exec_clock.stddev
     20.90 ± 11%     +42.0%      29.68 ± 17%  sched_debug.cfs_rq:/.nr_spread_over.avg
     54.08 ±  4%     +18.2%      63.93 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    349403 ± 73%     +94.9%     680874 ±  2%  sched_debug.cpu.avg_idle.min
    860.45 ±  4%     -37.1%     540.88 ±  7%  sched_debug.cpu.clock.stddev
    860.45 ±  4%     -37.1%     540.88 ±  7%  sched_debug.cpu.clock_task.stddev
      5355 ± 14%     +21.7%       6517        sched_debug.cpu.curr->pid.max
      0.00 ±  3%     -36.6%       0.00 ±  6%  sched_debug.cpu.next_balance.stddev
     16.16            -0.2       16.00        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__set_current_blocked.signal_setup_done.do_signal.exit_to_usermode_loop
     16.14            -0.2       15.99        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.__set_current_blocked.signal_setup_done.do_signal
     32.16            -0.1       32.04        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.__set_current_blocked.sigprocmask.__x64_sys_rt_sigprocmask
     16.12            -0.1       16.01        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.get_signal.do_signal.exit_to_usermode_loop
     16.13            -0.1       16.01        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64
     32.17            -0.1       32.05        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__set_current_blocked.sigprocmask.__x64_sys_rt_sigprocmask.do_syscall_64
     16.11            -0.1       16.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64
     16.12            -0.1       16.01        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
     16.63            -0.1       16.54        perf-profile.calltrace.cycles-pp.handler
     16.49            -0.1       16.40        perf-profile.calltrace.cycles-pp.__set_current_blocked.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64
     16.57            -0.1       16.48        perf-profile.calltrace.cycles-pp.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.handler
     16.58            -0.1       16.49        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.handler
     16.58            -0.1       16.49        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.handler
     16.57            -0.1       16.49        perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.handler
     16.52            -0.1       16.44        perf-profile.calltrace.cycles-pp.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     16.63            -0.1       16.56        perf-profile.calltrace.cycles-pp.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.raise
     16.65            -0.1       16.58        perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.raise
     16.52            -0.1       16.46        perf-profile.calltrace.cycles-pp.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.59 ±  4%      +0.1        0.71 ±  2%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.__set_current_blocked.sigprocmask.__x64_sys_rt_sigprocmask
      0.64 ±  5%      +0.1        0.77 ±  2%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.__set_current_blocked.sigprocmask.__x64_sys_rt_sigprocmask.do_syscall_64
      0.27 ±100%      +0.3        0.59 ±  2%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.__set_current_blocked.sigprocmask
      4.55 ±  2%      -8.7%       4.15        perf-stat.i.MPKI
      0.52            -0.1        0.47        perf-stat.i.branch-miss-rate%
  47148684            -8.9%   42963996        perf-stat.i.branch-misses
  35569495 ±  2%      -4.9%   33832492        perf-stat.i.cache-misses
  1.66e+08            -7.7%  1.533e+08        perf-stat.i.cache-references
     12.38            -1.2%      12.24        perf-stat.i.cpi
     12726 ±  2%      +5.1%      13374        perf-stat.i.cycles-between-cache-misses
      0.09 ±  4%      -0.0        0.07        perf-stat.i.iTLB-load-miss-rate%
  32470934 ±  5%     -14.8%   27651795        perf-stat.i.iTLB-load-misses
      1128 ±  4%     +18.6%       1339        perf-stat.i.instructions-per-iTLB-miss
      4.54 ±  2%      -8.7%       4.15        perf-stat.overall.MPKI
      0.52            -0.1        0.47 ±  2%  perf-stat.overall.branch-miss-rate%
     12.38            -1.2%      12.24        perf-stat.overall.cpi
     12722 ±  2%      +5.1%      13367        perf-stat.overall.cycles-between-cache-misses
      0.09 ±  4%      -0.0        0.07        perf-stat.overall.iTLB-load-miss-rate%
      1127 ±  4%     +18.5%       1336        perf-stat.overall.instructions-per-iTLB-miss
 1.414e+08 ±  2%     +43.9%  2.035e+08 ±  2%  perf-stat.overall.path-length
  46584988            -9.3%   42262634        perf-stat.ps.branch-misses
  35172756 ±  2%      -5.0%   33400721        perf-stat.ps.cache-misses
 1.641e+08            -7.8%  1.513e+08        perf-stat.ps.cache-references
  32116706 ±  5%     -15.0%   27304782        perf-stat.ps.iTLB-load-misses



***************************************************************************************************
lkp-knm01: 288 threads Knights Mill with 80G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-knm01/open1/will-it-scale

commit: 
  4bd4e92cfe ("sysfs: fix blank line coding style warning")
  570d020012 ("driver core: move device->knode_class to device_private")

4bd4e92cfe6d2af7 570d0200123fb4f809aa2f6226 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    945.00           -16.5%     788.75        will-it-scale.per_thread_ops
    568697 ±  7%     -21.3%     447754        will-it-scale.time.involuntary_context_switches
    303.55 ±  2%      -7.2%     281.60 ±  2%  will-it-scale.time.user_time
    272400           -16.6%     227268        will-it-scale.workload
      0.08 ±  9%      -0.0        0.05 ±  5%  mpstat.cpu.soft%
    192099           -11.9%     169161 ±  3%  slabinfo.filp.active_objs
   5043716           -12.1%    4431600 ±  2%  softirqs.RCU
      2.31 ±  2%     +11.2%       2.58        turbostat.RAMWatt
      4620 ±  6%     -19.2%       3732        vmstat.system.cs
      5458            +1.4%       5532        proc-vmstat.nr_inactive_anon
      5458            +1.4%       5532        proc-vmstat.nr_zone_inactive_anon
      7.27 ±  4%      -0.8        6.50 ±  4%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock
      6.02 ±  5%      -0.6        5.40 ±  5%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath
      1.22 ±  2%      -0.2        1.07 ±  2%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_open
      1.16 ±  2%      -0.1        1.02 ±  2%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     50.11            -0.1       49.98        perf-profile.calltrace.cycles-pp.__GI___libc_open
     47.70            +0.2       47.86        perf-profile.calltrace.cycles-pp._raw_spin_lock.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     47.89            +0.2       48.12        perf-profile.calltrace.cycles-pp._raw_spin_lock.__close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
    306780 ± 31%     -46.2%     165114 ± 27%  sched_debug.cfs_rq:/.MIN_vruntime.avg
  37943697 ± 12%     -23.6%   28986155 ± 20%  sched_debug.cfs_rq:/.MIN_vruntime.max
   3199728 ± 21%     -33.9%    2114259 ± 22%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
      1368 ±  7%     -17.0%       1135 ±  9%  sched_debug.cfs_rq:/.exec_clock.stddev
     10913 ± 25%     -29.9%       7650 ± 10%  sched_debug.cfs_rq:/.load.avg
     76190 ± 22%     -21.5%      59825 ±  7%  sched_debug.cfs_rq:/.load.stddev
    306780 ± 31%     -46.2%     165114 ± 27%  sched_debug.cfs_rq:/.max_vruntime.avg
  37943697 ± 12%     -23.6%   28986155 ± 20%  sched_debug.cfs_rq:/.max_vruntime.max
   3199728 ± 21%     -33.9%    2114259 ± 22%  sched_debug.cfs_rq:/.max_vruntime.stddev
     10906 ± 25%     -29.9%       7642 ± 10%  sched_debug.cfs_rq:/.runnable_weight.avg
     76186 ± 22%     -21.5%      59800 ±  7%  sched_debug.cfs_rq:/.runnable_weight.stddev
    234073 ± 39%    +134.5%     548907 ± 71%  sched_debug.cfs_rq:/.spread0.avg
      1117 ± 11%     -26.5%     821.87 ±  5%  sched_debug.cpu.clock.stddev
      1117 ± 11%     -26.5%     821.87 ±  5%  sched_debug.cpu.clock_task.stddev
      0.00 ±  8%     -25.6%       0.00 ±  6%  sched_debug.cpu.next_balance.stddev
      2333 ±  3%     -14.3%       1998        sched_debug.cpu.nr_switches.avg
      1323 ±  4%     -20.1%       1057        sched_debug.cpu.nr_switches.min
      2160 ±  4%     -12.5%       1891 ±  2%  sched_debug.cpu.sched_count.avg
      1179 ±  5%     -17.0%     978.15        sched_debug.cpu.sched_count.min
      1.55 ± 66%    -100.0%       0.00        sched_debug.cpu.sched_goidle.min
    912.99 ±  4%     -18.1%     747.65        sched_debug.cpu.ttwu_count.avg
    590.45 ±  5%     -21.5%     463.60 ±  2%  sched_debug.cpu.ttwu_count.min
    835.45 ±  5%     -19.8%     670.09        sched_debug.cpu.ttwu_local.avg
    561.95 ±  5%     -22.5%     435.70 ±  3%  sched_debug.cpu.ttwu_local.min
      6.47            -8.2%       5.93 ±  3%  perf-stat.i.MPKI
 8.137e+09            +1.8%  8.282e+09        perf-stat.i.branch-instructions
      0.85            -0.1        0.77 ±  2%  perf-stat.i.branch-miss-rate%
  69084658            -7.5%   63892585 ±  2%  perf-stat.i.branch-misses
  44409176            -6.8%   41391448 ±  3%  perf-stat.i.cache-misses
 2.151e+08            -6.9%  2.003e+08 ±  3%  perf-stat.i.cache-references
      4250 ±  7%     -19.1%       3436        perf-stat.i.context-switches
     13.05            -1.8%      12.82        perf-stat.i.cpi
      9785            +7.2%      10490 ±  3%  perf-stat.i.cycles-between-cache-misses
      0.20            -0.0        0.18 ±  6%  perf-stat.i.iTLB-load-miss-rate%
  67639103           -11.5%   59865979 ±  6%  perf-stat.i.iTLB-load-misses
 3.324e+10            +1.4%  3.371e+10        perf-stat.i.iTLB-loads
 3.328e+10            +1.5%  3.379e+10        perf-stat.i.instructions
    492.13           +15.2%     566.96 ±  6%  perf-stat.i.instructions-per-iTLB-miss
      0.08            +1.8%       0.08        perf-stat.i.ipc
      6.46            -8.3%       5.93 ±  3%  perf-stat.overall.MPKI
      0.85            -0.1        0.77 ±  2%  perf-stat.overall.branch-miss-rate%
     13.05            -1.8%      12.82        perf-stat.overall.cpi
      9782            +7.1%      10480 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.20            -0.0        0.18 ±  6%  perf-stat.overall.iTLB-load-miss-rate%
    492.04           +15.2%     566.73 ±  6%  perf-stat.overall.instructions-per-iTLB-miss
      0.08            +1.8%       0.08        perf-stat.overall.ipc
  36640385           +21.7%   44597360        perf-stat.overall.path-length
 8.103e+09            +1.8%  8.249e+09        perf-stat.ps.branch-instructions
  68811380            -7.5%   63646780 ±  2%  perf-stat.ps.branch-misses
  44229668            -6.8%   41229829 ±  3%  perf-stat.ps.cache-misses
 2.142e+08            -6.9%  1.995e+08 ±  3%  perf-stat.ps.cache-references
      4233 ±  7%     -19.1%       3423        perf-stat.ps.context-switches
  67364890           -11.5%   59634016 ±  6%  perf-stat.ps.iTLB-load-misses
 3.315e+10            +1.5%  3.365e+10        perf-stat.ps.instructions
  9.98e+12            +1.5%  1.013e+13        perf-stat.total.instructions





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-5.0.0-rc2-00005-g570d020" of type "text/plain" (169082 bytes)

View attachment "job.yaml" of type "text/plain" (4736 bytes)

View attachment "reproduce" of type "text/plain" (311 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ