Message-ID: <20181127060102.GF6163@shao2-debian>
Date:   Tue, 27 Nov 2018 14:01:02 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     NeilBrown <neilb@...e.com>
Cc:     Jeff Layton <jlayton@...nel.org>,
        "J. Bruce Fields" <bfields@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Jeff Layton <jlayton@...hat.com>, lkp@...org
Subject: [LKP] [fs/locks]  83b381078b:  will-it-scale.per_thread_ops -62.5%
 regression

Greetings,

FYI, we noticed a -62.5% regression in will-it-scale.per_thread_ops due to commit:


commit: 83b381078b5ecab098ebf6bc9548bb32af1dbf31 ("fs/locks: always delete_block after waiting.")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git locks-next

in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with the following parameters:

	nr_task: 16
	mode: thread
	test: lock1
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test so that differences between the two can be seen.
test-url: https://github.com/antonblanchard/will-it-scale
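
For readers unfamiliar with the lock1 workload, the following is a minimal Python sketch of the kind of loop it is assumed to run: each task repeatedly takes and releases a POSIX byte-range lock on its own temporary file, so every iteration goes through the fcntl() locking path that shows up in the profiles below. This is not the will-it-scale source. The function names, the one-byte lock range, and the fixed iteration count are illustrative assumptions, and Python's fcntl.lockf() is used here only as a convenient wrapper around the fcntl() locking calls.

```python
import fcntl
import os
import tempfile
import threading


def lock_unlock_loop(iterations, results, index):
    """Take and release a one-byte POSIX lock `iterations` times."""
    # Assumption: each task works on its own private temporary file.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        ops = 0
        for _ in range(iterations):
            # Blocking exclusive lock on byte 0 (no LOCK_NB flag).
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
            # Release the same byte range.
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
            ops += 2  # one lock plus one unlock
        results[index] = ops


def run_lock1_sketch(nr_task=16, iterations=1000):
    """Run the loop in `nr_task` threads (mode: thread) and return per-thread op counts."""
    results = [0] * nr_task
    threads = [
        threading.Thread(target=lock_unlock_loop, args=(iterations, results, i))
        for i in range(nr_task)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_lock1_sketch(nr_task=4, iterations=1000))
```

Because every thread locks a different file, the threads never block one another on the lock itself; any slowdown under this pattern points at shared kernel-side state rather than at lock conflicts.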

In addition, the commit also has a significant impact on the following test:

+------------------+-----------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -65.5% regression         |
| test machine     | 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                          |
|                  | mode=thread                                                           |
|                  | nr_task=100%                                                          |
|                  | test=lock1                                                            |
|                  | ucode=0x3d                                                            |
+------------------+-----------------------------------------------------------------------+


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/thread/16/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3d/lock1/will-it-scale

commit: 
  c5420ab794 ("fs/locks: allow a lock request to block other requests.")
  83b381078b ("fs/locks: always delete_block after waiting.")

c5420ab794c1a3a9 83b381078b5ecab098ebf6bc95 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    372966 ±  7%     -62.5%     140024        will-it-scale.per_thread_ops
      4601            +2.9%       4736        will-it-scale.time.system_time
    213.95 ±  6%     -62.9%      79.47        will-it-scale.time.user_time
   5967471 ±  7%     -62.5%    2240400        will-it-scale.workload
    681.35            +1.1%     688.54        boot-time.idle
      3.03 ±  6%      -1.8        1.20 ±  2%  mpstat.cpu.usr%
    230349 ± 21%     -26.6%     169168 ± 26%  softirqs.RCU
    130.28            -2.5%     126.96        turbostat.PkgWatt
      8.77 ±  2%      -3.0%       8.51        turbostat.RAMWatt
    787.75 ± 21%     +24.3%     978.96 ±  4%  sched_debug.cfs_rq:/.util_est_enqueued.max
     16345 ±  8%     -45.2%       8953 ± 17%  sched_debug.cpu.ttwu_local.max
      3555 ± 10%     -28.9%       2527 ±  8%  sched_debug.cpu.ttwu_local.stddev
 1.357e+12 ±  3%     -22.5%  1.052e+12        perf-stat.branch-instructions
      0.60            -0.2        0.44 ±  3%  perf-stat.branch-miss-rate%
 8.095e+09 ±  3%     -42.7%  4.642e+09 ±  3%  perf-stat.branch-misses
     44.72            -1.3       43.45        perf-stat.cache-miss-rate%
 1.006e+10 ± 19%     -20.3%  8.018e+09 ±  2%  perf-stat.cache-misses
 2.252e+10 ± 19%     -18.1%  1.845e+10 ±  2%  perf-stat.cache-references
      2.43 ±  3%     +36.2%       3.31        perf-stat.cpi
      0.00 ± 10%      +0.0        0.00 ±  8%  perf-stat.dTLB-load-miss-rate%
 1.699e+12 ±  3%     -30.3%  1.185e+12        perf-stat.dTLB-loads
 8.194e+11 ±  7%     -59.3%  3.337e+11        perf-stat.dTLB-stores
 4.037e+09 ±  4%     -65.3%  1.403e+09        perf-stat.iTLB-load-misses
 5.873e+08 ± 12%     -62.2%  2.223e+08 ± 17%  perf-stat.iTLB-loads
 6.141e+12 ±  3%     -26.9%  4.489e+12        perf-stat.instructions
      1522 ±  2%    +110.3%       3201        perf-stat.instructions-per-iTLB-miss
      0.41 ±  3%     -26.6%       0.30        perf-stat.ipc
     82.22            -2.5       79.75        perf-stat.node-store-miss-rate%
 2.253e+09 ±  2%     -16.0%  1.894e+09        perf-stat.node-store-misses
   1031848 ±  3%     +94.2%    2003878        perf-stat.path-length
     40.20 ± 29%     -39.2        0.96 ±  8%  perf-profile.calltrace.cycles-pp._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
     38.31 ± 29%     -38.3        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl
      2.46 ±  4%      -1.5        0.94 ± 11%  perf-profile.calltrace.cycles-pp.locks_alloc_lock.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
      2.26 ±  5%      -1.4        0.83 ±  8%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      2.15 ±  5%      -1.3        0.82 ± 12%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc.locks_alloc_lock.posix_lock_inode.do_lock_file_wait.fcntl_setlk
      1.85 ± 30%      -1.3        0.54 ±  3%  perf-profile.calltrace.cycles-pp.fput.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.07 ±  6%      -1.3        0.77 ±  4%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      1.20 ±  3%      -0.9        0.27 ±100%  perf-profile.calltrace.cycles-pp.locks_alloc_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
     67.78            +8.7       76.49 ±  8%  perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     66.76            +9.3       76.09 ±  8%  perf-profile.calltrace.cycles-pp.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +46.1       46.09 ±  8%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.locks_delete_block.do_lock_file_wait.fcntl_setlk
      0.00           +47.0       47.01 ±  8%  perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_delete_block.do_lock_file_wait.fcntl_setlk.do_fcntl
      0.00           +47.1       47.10 ±  8%  perf-profile.calltrace.cycles-pp.locks_delete_block.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
     24.15 ± 46%     +49.8       73.98 ±  8%  perf-profile.calltrace.cycles-pp.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      3.74 ±  4%      -2.3        1.44 ± 11%  perf-profile.children.cycles-pp.locks_alloc_lock
      3.28 ±  4%      -2.0        1.27 ± 12%  perf-profile.children.cycles-pp.kmem_cache_alloc
      2.42 ±  6%      -1.5        0.90 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      2.26 ±  5%      -1.4        0.83 ±  8%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.87 ± 30%      -1.3        0.55 ±  3%  perf-profile.children.cycles-pp.fput
      1.37 ± 29%      -1.1        0.29 ±  9%  perf-profile.children.cycles-pp.__fget_light
      1.58 ±  5%      -1.0        0.60 ±  6%  perf-profile.children.cycles-pp.file_has_perm
      1.07 ± 30%      -0.9        0.22 ± 11%  perf-profile.children.cycles-pp.__fget
      1.17 ±  6%      -0.7        0.45 ± 14%  perf-profile.children.cycles-pp.memset_erms
      1.08 ±  4%      -0.7        0.42 ± 11%  perf-profile.children.cycles-pp.security_file_lock
      1.01 ±  7%      -0.6        0.40 ±  7%  perf-profile.children.cycles-pp.security_file_fcntl
      0.89 ±  3%      -0.5        0.34 ±  7%  perf-profile.children.cycles-pp._copy_from_user
      0.85 ±  4%      -0.5        0.30 ±  7%  perf-profile.children.cycles-pp.avc_has_perm
      0.94 ±  9%      -0.4        0.54 ± 15%  perf-profile.children.cycles-pp.kmem_cache_free
      0.66 ±  4%      -0.4        0.26 ± 13%  perf-profile.children.cycles-pp.___might_sleep
      0.40 ±  2%      -0.2        0.15 ±  3%  perf-profile.children.cycles-pp.copy_user_generic_unrolled
      0.40 ±  3%      -0.2        0.17 ± 18%  perf-profile.children.cycles-pp.__might_sleep
      0.37 ±  7%      -0.2        0.14 ± 11%  perf-profile.children.cycles-pp.locks_dispose_list
      0.33 ±  4%      -0.2        0.12 ±  7%  perf-profile.children.cycles-pp.locks_delete_lock_ctx
      0.27 ±  5%      -0.2        0.10 ± 12%  perf-profile.children.cycles-pp._cond_resched
      0.28            -0.2        0.12 ± 10%  perf-profile.children.cycles-pp.selinux_file_lock
      0.26 ±  6%      -0.1        0.11 ± 13%  perf-profile.children.cycles-pp.__might_fault
      0.23 ±  6%      -0.1        0.08 ±  5%  perf-profile.children.cycles-pp.inode_has_perm
      0.22 ±  6%      -0.1        0.08 ± 13%  perf-profile.children.cycles-pp.locks_unlink_lock_ctx
      0.14 ±  7%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.rcu_all_qs
      0.16 ±  7%      -0.1        0.06 ± 11%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.12 ±  3%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.should_failslab
      0.13 ±  3%      -0.1        0.05 ±  8%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.10 ± 17%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.selinux_file_fcntl
      0.11 ±  4%      -0.1        0.04 ± 57%  perf-profile.children.cycles-pp.flock64_to_posix_lock
     67.83            +8.7       76.51 ±  8%  perf-profile.children.cycles-pp.do_fcntl
     66.80            +9.3       76.11 ±  8%  perf-profile.children.cycles-pp.fcntl_setlk
     58.30           +14.2       72.52 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock
     53.61 ±  2%     +15.5       69.06 ±  8%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.00           +47.1       47.10 ±  8%  perf-profile.children.cycles-pp.locks_delete_block
     24.19 ± 46%     +49.8       73.99 ±  8%  perf-profile.children.cycles-pp.do_lock_file_wait
      2.42 ±  6%      -1.5        0.90 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      2.26 ±  5%      -1.4        0.83 ±  8%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      1.86 ± 30%      -1.3        0.55 ±  3%  perf-profile.self.cycles-pp.fput
      4.66 ±  8%      -1.2        3.45 ±  7%  perf-profile.self.cycles-pp._raw_spin_lock
      1.06 ± 31%      -0.8        0.21 ± 14%  perf-profile.self.cycles-pp.__fget
      1.22 ±  4%      -0.8        0.46 ± 11%  perf-profile.self.cycles-pp.kmem_cache_alloc
      1.14 ±  5%      -0.7        0.43 ± 14%  perf-profile.self.cycles-pp.memset_erms
      0.84 ±  3%      -0.5        0.30 ±  7%  perf-profile.self.cycles-pp.avc_has_perm
      0.91 ±  9%      -0.4        0.46 ± 13%  perf-profile.self.cycles-pp.kmem_cache_free
      0.65 ± 13%      -0.4        0.21 ±  5%  perf-profile.self.cycles-pp.fcntl_setlk
      0.63 ±  5%      -0.4        0.26 ± 12%  perf-profile.self.cycles-pp.___might_sleep
      0.62 ±  5%      -0.4        0.24 ± 14%  perf-profile.self.cycles-pp.posix_lock_inode
      0.48 ±  9%      -0.3        0.20 ± 11%  perf-profile.self.cycles-pp.file_has_perm
      0.40 ±  4%      -0.3        0.14 ±  8%  perf-profile.self.cycles-pp.locks_alloc_lock
      0.41 ±  7%      -0.3        0.15 ± 13%  perf-profile.self.cycles-pp.__x64_sys_fcntl
      0.37            -0.2        0.14 ±  6%  perf-profile.self.cycles-pp.copy_user_generic_unrolled
      0.30 ± 47%      -0.2        0.08 ± 10%  perf-profile.self.cycles-pp.__fget_light
      0.36 ±  4%      -0.2        0.15 ± 18%  perf-profile.self.cycles-pp.__might_sleep
      0.28 ±  4%      -0.2        0.11 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      0.25 ±  3%      -0.1        0.11 ± 11%  perf-profile.self.cycles-pp.selinux_file_lock
      0.20 ±  7%      -0.1        0.07 ± 17%  perf-profile.self.cycles-pp.do_lock_file_wait
      0.21 ± 11%      -0.1        0.08 ± 10%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.20 ±  7%      -0.1        0.08 ±  5%  perf-profile.self.cycles-pp.inode_has_perm
      0.19 ±  3%      -0.1        0.07 ±  5%  perf-profile.self.cycles-pp.do_fcntl
      0.15 ±  8%      -0.1        0.05 ±  8%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      0.11 ±  4%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp._copy_from_user
      0.12 ±  4%      -0.1        0.04 ± 58%  perf-profile.self.cycles-pp._cond_resched
      0.20 ± 11%      -0.1        0.15 ± 16%  perf-profile.self.cycles-pp.locks_free_lock
     53.42 ±  2%     +15.4       68.84 ±  8%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
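
As a reading aid for the %change column, the sketch below recomputes a few rows of the table above. It assumes that plain counters and rates are reported as a relative change in percent, while metrics that are already percentages (names ending in "%", and the perf-profile rows) are reported as an absolute difference in percentage points. That convention is inferred from the numbers in this report rather than from the lkp-tests documentation, and the helper names are hypothetical.

```python
def pct_change(base, head):
    """Relative change in percent, for plain counters and rates."""
    return (head - base) / base * 100.0


def point_change(base, head):
    """Absolute difference, for metrics that are already percentages."""
    return head - base


# Values copied from the lkp-bdw-ep3d table above (base commit, head commit).
rows = [
    ("will-it-scale.per_thread_ops", pct_change, 372966, 140024),
    ("perf-stat.cpi", pct_change, 2.43, 3.31),
    ("mpstat.cpu.usr%", point_change, 3.03, 1.20),
]

for name, fn, base, head in rows:
    print(f"{name}: {fn(base, head):+.1f}")
```

Small disagreements with the reported figures are expected for some rows, since the table prints rounded means across several runs.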


                                                                                
                            will-it-scale.per_thread_ops                        
                                                                                
  450000 +-+----------------------------------------------------------------+   
         |                                                                  |   
  400000 +-+   +..+..  .+..+..  .+..+..+...+..+..+..   +..        .+..    ..|   
  350000 +-+ ..      +.       +.                     ..         +.    +..+  |   
         |  +                                       +     +     :           |   
  300000 +-+                                              :    :            |   
  250000 +-+                                               :   :            |   
         |                                                 :   :            |   
  200000 +-+                                               :   :            |   
  150000 +-+                                                : :             |   
         O  O  O  O  O  O  O  O  O  O  O   O  O  O  O  O  O :O: O  O  O  O  O   
  100000 +-+                                                : :             |   
   50000 +-+                                                : :             |   
         |                                                   :              |   
       0 +-+----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               will-it-scale.workload                           
                                                                                
  7e+06 +-+-----------------------------------------------------------------+   
        |              +...+..           .+..+..+     +            +..      |   
  6e+06 +-+   +..+.. ..         .+..+..+.        +   + +         ..       ..|   
        |   ..      +         +.                  + +   +       +     +..+  |   
  5e+06 +-++                                       +     +      :           |   
        |                                                :     :            |   
  4e+06 +-+                                               :    :            |   
        |                                                 :    :            |   
  3e+06 +-+                                                :   :            |   
        |                                       O     O    :  : O  O        |   
  2e+06 O-+O  O  O  O  O   O  O  O  O  O  O  O     O     O : O:       O  O  O   
        |                                                   : :             |   
  1e+06 +-+                                                 : :             |   
        |                                                    :              |   
      0 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                           will-it-scale.time.user_time                         
                                                                                
  250 +-+-------------------------------------------------------------------+   
      |                 .+..               .+..                   +..       |   
      |..   +...+..  .+.      .+...+..+..+.    +..    +..       ..   .    ..|   
  200 +-+ ..       +.       +.                    . ..         +      +..+  |   
      |  +                                         +     +     :            |   
      |                                                  :    :             |   
  150 +-+                                                 :   :             |   
      |                                                   :   :             |   
  100 +-+                                                 :   :             |   
      |  O      O                                          : :              |   
      O     O      O  O  O  O  O   O  O  O  O  O   O  O  O :O: O  O   O  O  O   
   50 +-+                                                  : :              |   
      |                                                    : :              |   
      |                                                     :               |   
    0 +-+-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                          will-it-scale.time.system_time                        
                                                                                
  5000 +-+------------------------------------------------------------------+   
  4500 O-+O..O..O...O..O..O..O..O..O..O...O..O..O..O..O..O  O  O...O..O..O..O   
       |                                                 :     :            |   
  4000 +-+                                               :     :            |   
  3500 +-+                                                :   :             |   
       |                                                  :   :             |   
  3000 +-+                                                :   :             |   
  2500 +-+                                                :   :             |   
  2000 +-+                                                 : :              |   
       |                                                   : :              |   
  1500 +-+                                                 : :              |   
  1000 +-+                                                 : :              |   
       |                                                    :               |   
   500 +-+                                                  :               |   
     0 +-+------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-hsw-ep4: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-hsw-ep4/lock1/will-it-scale/0x3d

commit: 
  c5420ab794 ("fs/locks: allow a lock request to block other requests.")
  83b381078b ("fs/locks: always delete_block after waiting.")

c5420ab794c1a3a9 83b381078b5ecab098ebf6bc95 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     81037           -65.5%      27921        will-it-scale.per_thread_ops
    138477            -2.6%     134943        will-it-scale.time.involuntary_context_switches
      7836            +1.6%       7964        will-it-scale.time.maximum_resident_set_size
    268.13 ±  2%     -64.8%      94.41        will-it-scale.time.user_time
   5834761           -65.5%    2010357        will-it-scale.workload
      1.26 ±  2%      -0.8        0.45 ±  4%  mpstat.cpu.usr%
     97.58 ±  8%     -15.7%      82.29 ±  5%  sched_debug.cpu.ttwu_count.min
    256.50            -2.3%     250.49 ±  2%  turbostat.PkgWatt
      1121            -5.2%       1062        vmstat.system.cs
     31.52 ± 10%     -12.2%      27.67 ± 10%  boot-time.boot
     25.82 ± 15%     -12.5%      22.59 ± 12%  boot-time.dhcp
      1794 ± 11%     -15.4%       1517 ±  9%  boot-time.idle
    111720 ± 26%     +34.7%     150535 ±  9%  numa-meminfo.node0.Active
    111718 ± 26%     +34.7%     150533 ±  9%  numa-meminfo.node0.Active(anon)
     91357 ± 24%     +44.8%     132283 ± 15%  numa-meminfo.node0.AnonPages
     27936 ± 26%     +34.7%      37633 ±  9%  numa-vmstat.node0.nr_active_anon
     22830 ± 24%     +44.8%      33067 ± 15%  numa-vmstat.node0.nr_anon_pages
     27936 ± 26%     +34.7%      37633 ±  9%  numa-vmstat.node0.nr_zone_active_anon
      1482 ±  9%     +11.1%       1647 ±  5%  slabinfo.UNIX.active_objs
      1482 ±  9%     +11.1%       1647 ±  5%  slabinfo.UNIX.num_objs
    399.00 ±  5%     -28.9%     283.50 ±  6%  slabinfo.kmem_cache.active_objs
    399.00 ±  5%     -28.9%     283.50 ±  6%  slabinfo.kmem_cache.num_objs
    686.00 ±  4%     -25.7%     510.00 ±  5%  slabinfo.kmem_cache_node.active_objs
    736.00 ±  4%     -23.9%     560.00 ±  4%  slabinfo.kmem_cache_node.num_objs
    651.00 ±  7%     -14.5%     556.50 ±  8%  slabinfo.mnt_cache.active_objs
    651.00 ±  7%     -14.5%     556.50 ±  8%  slabinfo.mnt_cache.num_objs
      1097 ± 12%     +19.6%       1311 ±  5%  slabinfo.task_group.active_objs
      1097 ± 12%     +19.6%       1311 ±  5%  slabinfo.task_group.num_objs
 3.766e+12            -8.2%  3.457e+12        perf-stat.branch-instructions
      0.26            -0.1        0.16 ±  4%  perf-stat.branch-miss-rate%
 9.894e+09           -43.1%  5.628e+09 ±  4%  perf-stat.branch-misses
 4.835e+09 ±  4%     -13.0%  4.208e+09        perf-stat.cache-misses
 1.155e+10 ±  4%     -11.8%  1.019e+10 ±  4%  perf-stat.cache-references
      3.77           +12.4%       4.24        perf-stat.cpi
 4.104e+12           -12.7%  3.582e+12        perf-stat.dTLB-loads
      0.01 ± 55%      +0.0        0.01 ± 11%  perf-stat.dTLB-store-miss-rate%
 8.153e+11           -61.3%  3.158e+11        perf-stat.dTLB-stores
 3.104e+09 ±  3%     -63.0%  1.148e+09 ±  5%  perf-stat.iTLB-load-misses
 4.672e+08 ± 16%     -60.7%  1.836e+08 ± 16%  perf-stat.iTLB-loads
 1.578e+13           -10.6%   1.41e+13        perf-stat.instructions
      5087 ±  3%    +142.2%      12320 ±  5%  perf-stat.instructions-per-iTLB-miss
      0.27           -11.0%       0.24        perf-stat.ipc
     98.73            +1.2       99.89        perf-stat.node-load-miss-rate%
  28488920 ± 58%     -92.2%    2213871 ± 20%  perf-stat.node-loads
     77.85            -6.5       71.39        perf-stat.node-store-miss-rate%
 2.072e+09 ±  2%     -27.5%  1.503e+09        perf-stat.node-store-misses
   2704766          +159.3%    7014271        perf-stat.path-length
     62.34 ± 35%     -62.3        0.00        perf-profile.calltrace.cycles-pp._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
     61.88 ± 35%     -61.9        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl
     98.52            +0.9       99.42        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     98.48            +0.9       99.39        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     98.32            +1.0       99.34        perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     97.16            +1.8       99.00        perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     96.91            +2.0       98.91        perf-profile.calltrace.cycles-pp.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     33.77 ± 65%     +64.4       98.20        perf-profile.calltrace.cycles-pp.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      0.00           +64.5       64.55        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.locks_delete_block.do_lock_file_wait.fcntl_setlk
      0.00           +64.9       64.89        perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_delete_block.do_lock_file_wait.fcntl_setlk.do_fcntl
      0.00           +64.9       64.92        perf-profile.calltrace.cycles-pp.locks_delete_block.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
      1.37 ±  4%      -0.8        0.53        perf-profile.children.cycles-pp.locks_alloc_lock
      1.19 ±  4%      -0.7        0.46        perf-profile.children.cycles-pp.kmem_cache_alloc
      0.73 ±  6%      -0.5        0.28 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.66 ±  6%      -0.4        0.24 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.44 ± 24%      -0.4        0.07        perf-profile.children.cycles-pp.fput
      0.49 ±  6%      -0.3        0.18 ±  2%  perf-profile.children.cycles-pp.file_has_perm
      0.40 ±  5%      -0.2        0.15        perf-profile.children.cycles-pp.memset_erms
      0.36 ±  7%      -0.2        0.12 ±  4%  perf-profile.children.cycles-pp.security_file_lock
      0.31 ±  9%      -0.2        0.12 ±  3%  perf-profile.children.cycles-pp.security_file_fcntl
      0.28 ± 15%      -0.2        0.09 ±  4%  perf-profile.children.cycles-pp.__fget_light
      0.25 ±  6%      -0.2        0.09        perf-profile.children.cycles-pp.avc_has_perm
      0.25 ±  5%      -0.2        0.10 ±  5%  perf-profile.children.cycles-pp.___might_sleep
      0.20 ±  3%      -0.1        0.07 ±  5%  perf-profile.children.cycles-pp.__fget
      0.19 ± 11%      -0.1        0.07 ±  6%  perf-profile.children.cycles-pp._copy_from_user
      0.15 ±  9%      -0.1        0.06        perf-profile.children.cycles-pp.__might_sleep
      0.13 ±  8%      -0.1        0.04 ± 57%  perf-profile.children.cycles-pp.locks_dispose_list
      0.29 ±  7%      -0.1        0.24 ±  5%  perf-profile.children.cycles-pp.kmem_cache_free
      0.13 ±  9%      -0.0        0.10 ±  8%  perf-profile.children.cycles-pp.locks_free_lock
      0.36 ±  3%      +0.0        0.39        perf-profile.children.cycles-pp.apic_timer_interrupt
     98.56            +0.9       99.44        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     98.49            +0.9       99.42        perf-profile.children.cycles-pp.do_syscall_64
     98.34            +1.0       99.34        perf-profile.children.cycles-pp.__x64_sys_fcntl
     97.17            +1.8       99.00        perf-profile.children.cycles-pp.do_fcntl
     96.93            +2.0       98.92        perf-profile.children.cycles-pp.fcntl_setlk
     93.97            +3.7       97.63        perf-profile.children.cycles-pp._raw_spin_lock
     92.80            +4.0       96.79        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     33.79 ± 65%     +64.4       98.20        perf-profile.children.cycles-pp.do_lock_file_wait
      0.00           +64.9       64.93        perf-profile.children.cycles-pp.locks_delete_block
      0.73 ±  6%      -0.5        0.28 ±  3%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.66 ±  6%      -0.4        0.24 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.44 ± 24%      -0.4        0.07 ±  6%  perf-profile.self.cycles-pp.fput
      1.17 ± 14%      -0.3        0.84 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock
      0.45 ±  5%      -0.3        0.17 ±  4%  perf-profile.self.cycles-pp.kmem_cache_alloc
      0.39 ±  5%      -0.2        0.15 ±  3%  perf-profile.self.cycles-pp.memset_erms
      0.24 ±  3%      -0.2        0.08 ±  5%  perf-profile.self.cycles-pp.posix_lock_inode
      0.24 ±  6%      -0.2        0.09 ±  4%  perf-profile.self.cycles-pp.avc_has_perm
      0.24 ±  6%      -0.2        0.09        perf-profile.self.cycles-pp.___might_sleep
      0.19 ±  2%      -0.1        0.07        perf-profile.self.cycles-pp.__fget
      0.17 ±  7%      -0.1        0.06 ±  6%  perf-profile.self.cycles-pp.locks_alloc_lock
      0.16 ± 12%      -0.1        0.06 ±  7%  perf-profile.self.cycles-pp.file_has_perm
      0.17 ±  7%      -0.1        0.07 ± 10%  perf-profile.self.cycles-pp.fcntl_setlk
      0.12 ±  8%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.do_syscall_64
      0.14 ±  9%      -0.1        0.05 ±  8%  perf-profile.self.cycles-pp.__might_sleep
      0.29 ±  6%      -0.1        0.21 ±  5%  perf-profile.self.cycles-pp.kmem_cache_free
      0.13 ±  8%      -0.1        0.05 ±  8%  perf-profile.self.cycles-pp.__x64_sys_fcntl
      0.08 ± 12%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.locks_free_lock
     92.45            +3.9       96.37        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
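
Several perf-stat rows in the lkp-hsw-ep4 table above appear to be ratios of other rows. The sketch below checks this under three assumptions that are inferred from the numbers rather than documented here: perf-stat.path-length is instructions divided by will-it-scale.workload, perf-stat.instructions-per-iTLB-miss is instructions divided by iTLB-load-misses, and perf-stat.ipc is the reciprocal of perf-stat.cpi. The dictionary keys and the `derived` helper are hypothetical names.

```python
# Values copied from the lkp-hsw-ep4 table above.
base = {"instructions": 1.578e13, "workload": 5834761,
        "itlb_load_misses": 3.104e9, "cpi": 3.77}
head = {"instructions": 1.41e13, "workload": 2010357,
        "itlb_load_misses": 1.148e9, "cpi": 4.24}


def derived(sample):
    """Recompute the ratio metrics from the raw counters (assumed definitions)."""
    return {
        "path_length": sample["instructions"] / sample["workload"],
        "instructions_per_itlb_miss": sample["instructions"] / sample["itlb_load_misses"],
        "ipc": 1.0 / sample["cpi"],
    }


for label, sample in (("c5420ab794", base), ("83b381078b", head)):
    d = derived(sample)
    print(label,
          f"path_length={d['path_length']:.0f}",
          f"instr/iTLB-miss={d['instructions_per_itlb_miss']:.0f}",
          f"ipc={d['ipc']:.2f}")
```

Read this way, the total instruction count drops by only about 10% while the number of completed operations drops by about 65%, so each lock or unlock operation costs far more instructions after the commit, which is consistent with the extra time spent in native_queued_spin_lock_slowpath under locks_delete_block.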





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.20.0-rc2-00008-g83b3810" of type "text/plain" (168529 bytes)

View attachment "job-script" of type "text/plain" (6924 bytes)

View attachment "job.yaml" of type "text/plain" (4578 bytes)

View attachment "reproduce" of type "text/plain" (308 bytes)
