Date:   Thu, 8 Apr 2021 10:26:29 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Dave Chinner <dchinner@...hat.com>
Cc:     "Darrick J. Wong" <djwong@...nel.org>,
        Christoph Hellwig <hch@....de>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [xfs]  1fea323ff0:  aim7.jobs-per-min 2.4% improvement



Greetings,

FYI, we noticed a 2.4% improvement of aim7.jobs-per-min due to commit:


commit: 1fea323ff00526dcc04fbb4ee6e7d04e4e2ab0e1 ("xfs: reduce debug overhead of dir leaf/node checks")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


in testcase: aim7
on test machine: 88 threads Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with following parameters:

	disk: 4BRD_12G
	md: RAID1
	fs: xfs
	test: disk_rw
	load: 3000
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/

In addition, the commit also has a significant impact on the following test:

+------------------+------------------------------------------------------------------------+
| testcase: change | aim7: aim7.jobs-per-min 1.6% improvement                               |
| test machine     | 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                           |
|                  | disk=1BRD_48G                                                          |
|                  | fs=xfs                                                                 |
|                  | load=3000                                                              |
|                  | test=disk_rw                                                           |
|                  | ucode=0x700001e                                                        |
+------------------+------------------------------------------------------------------------+




Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/4BRD_12G/xfs/x86_64-rhel-8.3/3000/RAID1/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/disk_rw/aim7/0x5003006

commit: 
  39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
  1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")

39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :6           33%           2:6     kmsg.XFS(md#):xlog_verify_grant_tail:space>BBTOB(tail_blocks)
         %stddev     %change         %stddev
             \          |                \  
    505405            +2.4%     517621        aim7.jobs-per-min
     35.82            -2.4%      34.98        aim7.time.elapsed_time
     35.82            -2.4%      34.98        aim7.time.elapsed_time.max
      2866 ± 35%     +39.0%       3985 ±  3%  interrupts.CPU53.NMI:Non-maskable_interrupts
      2866 ± 35%     +39.0%       3985 ±  3%  interrupts.CPU53.PMI:Performance_monitoring_interrupts
    286711            -2.5%     279423        proc-vmstat.nr_dirty
    554636            -1.3%     547330        proc-vmstat.nr_file_pages
    286865            -2.5%     279593        proc-vmstat.nr_inactive_file
    286865            -2.5%     279593        proc-vmstat.nr_zone_inactive_file
    287057            -2.6%     279704        proc-vmstat.nr_zone_write_pending
 1.313e+10            +2.0%   1.34e+10        perf-stat.i.branch-instructions
     52558            +2.7%      53962        perf-stat.i.context-switches
      1942            +7.4%       2086 ±  2%  perf-stat.i.cpu-migrations
   1.9e+10            +2.0%  1.939e+10        perf-stat.i.dTLB-loads
 1.061e+10            +2.5%  1.087e+10        perf-stat.i.dTLB-stores
 6.606e+10            +2.1%  6.743e+10        perf-stat.i.instructions
    487.84            +2.1%     498.03        perf-stat.i.metric.M/sec
   3171946            +6.1%    3364545        perf-stat.i.node-store-misses
  10014711            +2.8%   10299278        perf-stat.i.node-stores
     24.04            +0.6       24.62        perf-stat.overall.node-store-miss-rate%
 1.286e+10            +2.0%  1.311e+10        perf-stat.ps.branch-instructions
     51473            +2.6%      52806        perf-stat.ps.context-switches
      1903            +7.2%       2040 ±  2%  perf-stat.ps.cpu-migrations
 1.861e+10            +2.0%  1.898e+10        perf-stat.ps.dTLB-loads
 1.039e+10            +2.4%  1.064e+10        perf-stat.ps.dTLB-stores
 6.469e+10            +2.0%  6.598e+10        perf-stat.ps.instructions
   3106311            +6.0%    3293166        perf-stat.ps.node-store-misses
   9812026            +2.8%   10082006        perf-stat.ps.node-stores
      2.29 ±  7%      -0.2        2.07 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.28 ±  7%      -0.2        2.06 ±  2%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.31 ±  7%      -0.2        2.10 ±  2%  perf-profile.calltrace.cycles-pp.unlink
      2.29 ±  7%      -0.2        2.08 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
      1.66 ±  8%      -0.2        1.48 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.00 ±  6%      -0.2        1.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
      2.00 ±  6%      -0.2        1.83        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      1.98 ±  6%      -0.2        1.80 ±  2%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.00 ±  6%      -0.2        1.83        perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.00 ±  6%      -0.2        1.83        perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      1.97 ±  6%      -0.2        1.80 ±  2%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
      2.01 ±  6%      -0.2        1.84        perf-profile.calltrace.cycles-pp.creat64
      0.90 ± 11%      -0.1        0.79 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.92 ±  9%      -0.1        0.81 ±  2%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      0.73 ± 11%      -0.1        0.63 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
      0.69 ±  6%      -0.1        0.61 ±  6%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.58 ±  8%      -0.3        2.29 ±  3%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      2.29 ±  7%      -0.2        2.06 ±  2%  perf-profile.children.cycles-pp.do_unlinkat
      2.32 ±  7%      -0.2        2.10 ±  2%  perf-profile.children.cycles-pp.unlink
      1.62 ± 11%      -0.2        1.42 ±  3%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
      2.08 ±  6%      -0.2        1.90        perf-profile.children.cycles-pp.do_sys_open
      2.04 ±  6%      -0.2        1.86 ±  2%  perf-profile.children.cycles-pp.do_filp_open
      2.07 ±  6%      -0.2        1.90        perf-profile.children.cycles-pp.do_sys_openat2
      2.02 ±  6%      -0.2        1.85 ±  2%  perf-profile.children.cycles-pp.creat64
      2.03 ±  6%      -0.2        1.86        perf-profile.children.cycles-pp.path_openat
      0.82 ±  6%      -0.1        0.72 ±  5%  perf-profile.children.cycles-pp.osq_lock
      0.18 ± 84%      -0.1        0.09 ±  9%  perf-profile.children.cycles-pp.xfs_vn_lookup
      0.50 ±  2%      -0.1        0.44 ±  2%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.14 ±  6%      -0.0        0.10 ±  7%  perf-profile.children.cycles-pp.write@plt
      0.12 ± 11%      -0.0        0.09        perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
      0.11 ± 18%      -0.0        0.08 ±  8%  perf-profile.children.cycles-pp.xfs_dir_lookup
      0.22 ±  7%      -0.0        0.20 ±  2%  perf-profile.children.cycles-pp.update_process_times
      0.09 ±  8%      -0.0        0.07 ± 14%  perf-profile.children.cycles-pp.xfs_dir2_node_lookup
      0.25 ± 44%      +0.1        0.35 ±  5%  perf-profile.children.cycles-pp.xfs_file_llseek
      1.61 ± 11%      -0.2        1.41 ±  3%  perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.81 ±  6%      -0.1        0.72 ±  5%  perf-profile.self.cycles-pp.osq_lock
      0.48 ±  2%      -0.1        0.41 ±  3%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.10 ±  6%      -0.1        0.04 ± 44%  perf-profile.self.cycles-pp.write@plt
      0.24 ± 44%      +0.1        0.34 ±  5%  perf-profile.self.cycles-pp.xfs_file_llseek
      0.77 ± 13%      +0.2        0.94 ±  4%  perf-profile.self.cycles-pp.xfs_file_buffered_write
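As a quick sanity check, the headline 2.4% figure follows directly from the two aim7.jobs-per-min values in the table above (a minimal sketch; the numbers are copied from this report):

```python
# Baseline (39d3c0b596) and patched (1fea323ff0) aim7.jobs-per-min
# values, taken from the comparison table above.
base = 505405
patched = 517621

# Relative improvement as a percentage of the baseline throughput.
improvement = (patched - base) / base * 100
print(f"{improvement:.1f}%")  # 2.4%
```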


                                                                                
                                  aim7.jobs-per-min                             
                                                                                
  540000 +------------------------------------------------------------------+   
         |         O                                                        |   
  530000 |-+                               O  O O   O           O           |   
         | O O O O   O   O                            O       O             |   
         |             O   O   O O   O   O  O     O     O O O     O         |   
  520000 |-+                 O     O   O                            O O O   |   
         |                                                                O |   
  510000 |-+                                                   .+           |   
         |                                               .+.+.+             |   
  500000 |-+                                          +.+                   |   
         |                                           :                      |   
         |       +.    +         +              +.   :                      |   
  490000 |.+.+. +  +. + + .+.+. + +            +  +.+                       |   
         |     +     +   +     +   +.+.+.+.++.+                             |   
  480000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-cpl-4sp1: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/1BRD_48G/xfs/x86_64-rhel-8.3/3000/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/disk_rw/aim7/0x700001e

commit: 
  39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
  1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")

39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    500977            +1.6%     509113        aim7.jobs-per-min
     36.14            -1.6%      35.57        aim7.time.elapsed_time
     36.14            -1.6%      35.57        aim7.time.elapsed_time.max
     40.93 ±  2%      -4.3%      39.19        aim7.time.user_time
     28267 ± 79%     -81.7%       5164 ±  5%  numa-meminfo.node2.KernelStack
     28180 ± 78%     -81.7%       5162 ±  5%  numa-vmstat.node2.nr_kernel_stack
    291109            -1.6%     286393        proc-vmstat.nr_dirty
     11049 ±  5%      +9.0%      12039 ±  4%  slabinfo.pde_opener.active_objs
     11049 ±  5%      +9.0%      12039 ±  4%  slabinfo.pde_opener.num_objs
      1579 ± 33%     +29.9%       2051 ± 25%  interrupts.CPU109.NMI:Non-maskable_interrupts
      1579 ± 33%     +29.9%       2051 ± 25%  interrupts.CPU109.PMI:Performance_monitoring_interrupts
      1785 ± 30%     +45.8%       2602 ±  8%  interrupts.CPU117.NMI:Non-maskable_interrupts
      1785 ± 30%     +45.8%       2602 ±  8%  interrupts.CPU117.PMI:Performance_monitoring_interrupts
    891.67 ±  8%     +99.4%       1778 ± 47%  interrupts.CPU4.CAL:Function_call_interrupts
 1.301e+10            +1.6%  1.322e+10        perf-stat.i.branch-instructions
     52509            +2.1%      53602        perf-stat.i.context-switches
  1.89e+10            +1.8%  1.924e+10        perf-stat.i.dTLB-loads
 1.061e+10            +1.9%  1.081e+10        perf-stat.i.dTLB-stores
 6.554e+10            +1.6%   6.66e+10        perf-stat.i.instructions
    296.86            +1.8%     302.18        perf-stat.i.metric.M/sec
     76.63            +1.0       77.63        perf-stat.i.node-load-miss-rate%
   3774653 ±  2%      +6.6%    4025641 ±  3%  perf-stat.i.node-loads
   4414091 ±  3%      +7.6%    4747750        perf-stat.i.node-store-misses
   9344160            +2.3%    9559103        perf-stat.i.node-stores
     32.07 ±  2%      +1.1       33.18        perf-stat.overall.node-store-miss-rate%
 1.271e+10            +1.8%  1.293e+10        perf-stat.ps.branch-instructions
     51278            +2.3%      52440        perf-stat.ps.context-switches
 1.846e+10            +2.0%  1.883e+10        perf-stat.ps.dTLB-loads
 1.036e+10            +2.1%  1.057e+10        perf-stat.ps.dTLB-stores
   6.4e+10            +1.8%  6.516e+10        perf-stat.ps.instructions
   3686827 ±  2%      +6.9%    3940762 ±  3%  perf-stat.ps.node-loads
   4310625 ±  3%      +7.8%    4645614        perf-stat.ps.node-store-misses
   9127740            +2.5%    9355048        perf-stat.ps.node-stores
      2.54            -0.2        2.30 ±  4%  perf-profile.calltrace.cycles-pp.creat64
      2.53            -0.2        2.29 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
      2.52            -0.2        2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.50            -0.2        2.26 ±  4%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.50            -0.2        2.26 ±  4%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
      2.52            -0.2        2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.52            -0.2        2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.82 ±  2%      -0.2        2.62 ±  3%  perf-profile.calltrace.cycles-pp.unlink
      2.79 ±  2%      -0.2        2.59 ±  3%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.80 ±  2%      -0.2        2.61 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
      2.79 ±  2%      -0.2        2.60 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.10            -0.1        1.95 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      1.21 ±  3%      -0.1        1.10 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      0.96 ±  5%      -0.1        0.87 ±  4%  perf-profile.calltrace.cycles-pp.xfs_generic_create.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      1.07 ±  3%      -0.1        0.99 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.92 ±  2%      -0.1        0.85 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
      3.31            -0.3        3.06 ±  4%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      2.56            -0.2        2.31 ±  4%  perf-profile.children.cycles-pp.do_filp_open
      2.55 ±  2%      -0.2        2.30 ±  4%  perf-profile.children.cycles-pp.creat64
      2.56            -0.2        2.31 ±  4%  perf-profile.children.cycles-pp.path_openat
      2.60            -0.2        2.36 ±  4%  perf-profile.children.cycles-pp.do_sys_open
      2.60            -0.2        2.36 ±  4%  perf-profile.children.cycles-pp.do_sys_openat2
      2.83 ±  2%      -0.2        2.63 ±  3%  perf-profile.children.cycles-pp.unlink
      2.79 ±  2%      -0.2        2.60 ±  3%  perf-profile.children.cycles-pp.do_unlinkat
      1.99 ±  2%      -0.1        1.84 ±  3%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
      0.96 ±  5%      -0.1        0.87 ±  4%  perf-profile.children.cycles-pp.xfs_generic_create
      0.45 ±  5%      -0.1        0.37 ±  7%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.17 ±  5%      -0.0        0.13 ±  9%  perf-profile.children.cycles-pp.write@plt
      0.12 ±  4%      -0.0        0.09 ±  5%  perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
      0.09 ±  7%      -0.0        0.07 ±  7%  perf-profile.children.cycles-pp.generic_file_llseek_size
      0.09 ±  7%      -0.0        0.07 ± 10%  perf-profile.children.cycles-pp.xfs_dir2_node_lookup
      0.08 ± 11%      -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.wake_up_q
      1.97 ±  2%      -0.2        1.82 ±  3%  perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.43 ±  5%      -0.1        0.34 ±  7%  perf-profile.self.cycles-pp.__fsnotify_parent
      1.17 ±  3%      -0.1        1.10 ±  4%  perf-profile.self.cycles-pp.write
      0.10 ±  7%      -0.1        0.05 ± 45%  perf-profile.self.cycles-pp.write@plt
      0.09 ±  7%      -0.0        0.07 ±  7%  perf-profile.self.cycles-pp.generic_file_llseek_size
      0.19 ±  3%      -0.0        0.17 ±  4%  perf-profile.self.cycles-pp.xfs_get_extsz_hint
      0.21 ±  6%      +0.0        0.24 ±  6%  perf-profile.self.cycles-pp.propagate_protected_usage
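On this machine the 1.6% throughput gain is mirrored by the drop in elapsed time, as expected for a fixed-load run (higher jobs-per-min implies a proportionally shorter wall clock). A quick check with the numbers copied from the table above:

```python
# aim7.jobs-per-min and aim7.time.elapsed_time for the baseline and
# patched kernels, from the 144-thread comparison table above.
jobs_base, jobs_patched = 500977, 509113
time_base, time_patched = 36.14, 35.57

throughput_gain = (jobs_patched - jobs_base) / jobs_base * 100
time_drop = (time_base - time_patched) / time_base * 100

# Both deltas round to the same 1.6%, consistent with a fixed amount of
# work completing faster.
print(f"throughput +{throughput_gain:.1f}%, elapsed time -{time_drop:.1f}%")
```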





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc4-00020-g1fea323ff005" of type "text/plain" (172899 bytes)

View attachment "job-script" of type "text/plain" (8050 bytes)

View attachment "job.yaml" of type "text/plain" (5508 bytes)

View attachment "reproduce" of type "text/plain" (1026 bytes)
