Message-ID: <202401311609.2c8c0628-oliver.sang@intel.com>
Date: Wed, 31 Jan 2024 22:08:20 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Jens Axboe <axboe@...nel.dk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Bart Van Assche <bvanassche@....org>, <ying.huang@...el.com>,
	<feng.tang@...el.com>, <fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [block]  53889bcaf5:  stress-ng.ioprio.ops_per_sec
 13.0% improvement



Hello,

kernel test robot noticed a 13.0% improvement of stress-ng.ioprio.ops_per_sec on:


commit: 53889bcaf536b3abedeaf104019877cee37dd08b ("block: make __get_task_ioprio() easier to read")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: btrfs
	test: ioprio
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240131/202401311609.2c8c0628-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/ioprio/stress-ng/60s

commit: 
  3b7cb74547 ("block: move __get_task_ioprio() into header file")
  53889bcaf5 ("block: make __get_task_ioprio() easier to read")

3b7cb745473aec72 53889bcaf536b3abedeaf104019 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     31039 ± 50%     +62.9%      50565 ± 30%  numa-vmstat.node1.nr_anon_pages
      0.01 ± 20%     -35.7%       0.00 ± 21%  perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     70.30            -3.2%      68.07        turbostat.RAMWatt
     14275 ±109%    +173.4%      39022 ± 46%  numa-meminfo.node1.AnonHugePages
    124137 ± 50%     +62.9%     202248 ± 30%  numa-meminfo.node1.AnonPages
    111.17 ± 16%     -68.2%      35.33 ± 27%  perf-c2c.DRAM.local
    221.33 ±  5%     -56.2%      97.00 ± 10%  perf-c2c.DRAM.remote
    524510 ±  8%     +25.5%     658460 ± 10%  sched_debug.cpu.max_idle_balance_cost.max
      3643 ±165%    +494.7%      21671 ± 34%  sched_debug.cpu.max_idle_balance_cost.stddev
   4756555           +13.0%    5374750        stress-ng.ioprio.ops
     79272           +13.0%      89575        stress-ng.ioprio.ops_per_sec
      4.52           -31.6%       3.09 ±  8%  perf-stat.i.MPKI
 3.514e+09            +6.2%  3.734e+09        perf-stat.i.branch-instructions
      0.31 ±  6%      -0.1        0.25 ±  7%  perf-stat.i.branch-miss-rate%
  12495630 ±  4%     -12.9%   10889062 ±  6%  perf-stat.i.branch-misses
      9.01            -2.0        7.03 ±  8%  perf-stat.i.cache-miss-rate%
  73180840           -32.5%   49400992 ±  8%  perf-stat.i.cache-misses
 8.118e+08           -13.4%  7.029e+08        perf-stat.i.cache-references
      1.41            +1.6%       1.43        perf-stat.i.cpi
    328.29 ±  3%     +46.7%     481.44 ±  6%  perf-stat.i.cycles-between-cache-misses
 3.862e+09            +6.8%  4.123e+09        perf-stat.i.dTLB-loads
      0.00            -0.0        0.00 ±  4%  perf-stat.i.dTLB-store-miss-rate%
 1.695e+09           +11.5%  1.891e+09        perf-stat.i.dTLB-stores
 1.658e+10            -1.8%  1.628e+10        perf-stat.i.instructions
    154.35            +5.7%     163.21        perf-stat.i.metric.M/sec
   7769868 ±  5%     -66.9%    2575037 ±  8%  perf-stat.i.node-load-misses
   1891321 ± 12%     -67.4%     616722 ± 38%  perf-stat.i.node-loads
      4.42           -31.3%       3.03 ±  7%  perf-stat.overall.MPKI
      0.36 ±  5%      -0.1        0.29 ±  6%  perf-stat.overall.branch-miss-rate%
      9.02            -2.0        7.03 ±  8%  perf-stat.overall.cache-miss-rate%
      1.39            +1.6%       1.42        perf-stat.overall.cpi
    315.95           +48.7%     469.87 ±  7%  perf-stat.overall.cycles-between-cache-misses
      0.00            -0.0        0.00 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
      0.72            -1.6%       0.71        perf-stat.overall.ipc
 3.455e+09            +6.2%  3.671e+09        perf-stat.ps.branch-instructions
  12275780 ±  4%     -12.8%   10699106 ±  6%  perf-stat.ps.branch-misses
  71968656           -32.5%   48582093 ±  8%  perf-stat.ps.cache-misses
 7.982e+08           -13.4%  6.912e+08        perf-stat.ps.cache-references
 3.797e+09            +6.8%  4.054e+09        perf-stat.ps.dTLB-loads
 1.667e+09           +11.5%  1.859e+09        perf-stat.ps.dTLB-stores
  1.63e+10            -1.8%  1.601e+10        perf-stat.ps.instructions
   7640492 ±  5%     -66.9%    2532136 ±  8%  perf-stat.ps.node-load-misses
   1859653 ± 12%     -67.4%     606316 ± 38%  perf-stat.ps.node-loads
 9.886e+11            -1.8%  9.707e+11        perf-stat.total.instructions
      0.59 ±  2%      +0.0        0.63 ±  2%  perf-profile.calltrace.cycles-pp.__generic_file_write_iter.generic_file_write_iter.do_iter_readv_writev.do_iter_write.vfs_writev
      0.63 ±  2%      +0.0        0.68 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fsync
      0.62 ±  4%      +0.0        0.68 ±  4%  perf-profile.calltrace.cycles-pp.import_iovec.vfs_writev.__x64_sys_pwritev.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.13            +0.1        1.18 ±  2%  perf-profile.calltrace.cycles-pp.filemap_get_entry.__filemap_get_folio.simple_write_begin.generic_perform_write.generic_file_write_iter
      1.52            +0.1        1.62        perf-profile.calltrace.cycles-pp.__filemap_get_folio.simple_write_begin.generic_perform_write.generic_file_write_iter.do_iter_readv_writev
      1.59            +0.1        1.69        perf-profile.calltrace.cycles-pp.simple_write_begin.generic_perform_write.generic_file_write_iter.do_iter_readv_writev.do_iter_write
      1.10 ±  3%      +0.1        1.22 ±  2%  perf-profile.calltrace.cycles-pp.security_task_getioprio.__do_sys_ioprio_get.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.39 ±  2%      +0.1        1.52 ±  3%  perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.generic_file_write_iter.do_iter_readv_writev
      1.60 ±  2%      +0.1        1.74 ±  2%  perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.generic_file_write_iter.do_iter_readv_writev.do_iter_write
      0.35 ± 70%      +0.2        0.54        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      1.60 ±  3%      +0.2        1.79 ±  4%  perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.generic_file_write_iter.do_iter_readv_writev.do_iter_write
      7.02            +0.6        7.62 ±  2%  perf-profile.calltrace.cycles-pp.generic_perform_write.generic_file_write_iter.do_iter_readv_writev.do_iter_write.vfs_writev
      8.19            +0.7        8.89 ±  2%  perf-profile.calltrace.cycles-pp.generic_file_write_iter.do_iter_readv_writev.do_iter_write.vfs_writev.__x64_sys_pwritev
      8.54            +0.7        9.26 ±  2%  perf-profile.calltrace.cycles-pp.do_iter_readv_writev.do_iter_write.vfs_writev.__x64_sys_pwritev.do_syscall_64
      9.20            +0.8        9.95 ±  2%  perf-profile.calltrace.cycles-pp.do_iter_write.vfs_writev.__x64_sys_pwritev.do_syscall_64.entry_SYSCALL_64_after_hwframe
     52.90 ±  2%      +6.3       59.21        perf-profile.calltrace.cycles-pp._raw_spin_lock.get_task_ioprio.__do_sys_ioprio_get.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.20 ±  3%      +0.0        0.22 ±  2%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.14 ±  7%      +0.0        0.18 ± 13%  perf-profile.children.cycles-pp.up_write
      0.30 ±  5%      +0.0        0.33 ±  5%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.42 ±  2%      +0.0        0.46 ±  3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.61 ±  2%      +0.0        0.66 ±  3%  perf-profile.children.cycles-pp.__generic_file_write_iter
      1.15            +0.1        1.20 ±  2%  perf-profile.children.cycles-pp.filemap_get_entry
      0.32 ±  4%      +0.1        0.38 ±  8%  perf-profile.children.cycles-pp.__radix_tree_lookup
      0.90 ±  2%      +0.1        0.97        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.42 ±  4%      +0.1        0.49 ±  6%  perf-profile.children.cycles-pp.find_task_by_vpid
      1.56            +0.1        1.64        perf-profile.children.cycles-pp.__filemap_get_folio
      1.60            +0.1        1.70        perf-profile.children.cycles-pp.simple_write_begin
      1.49 ±  2%      +0.1        1.62 ±  3%  perf-profile.children.cycles-pp.fault_in_readable
      1.64 ±  2%      +0.2        1.80 ±  3%  perf-profile.children.cycles-pp.fault_in_iov_iter_readable
      1.45 ±  3%      +0.2        1.63        perf-profile.children.cycles-pp.security_task_getioprio
      1.61 ±  3%      +0.2        1.80 ±  4%  perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      7.08            +0.6        7.68 ±  2%  perf-profile.children.cycles-pp.generic_perform_write
      8.22            +0.7        8.92 ±  2%  perf-profile.children.cycles-pp.generic_file_write_iter
      8.56            +0.7        9.27 ±  2%  perf-profile.children.cycles-pp.do_iter_readv_writev
      9.21            +0.8        9.97 ±  2%  perf-profile.children.cycles-pp.do_iter_write
     62.57            +1.3       63.83        perf-profile.children.cycles-pp.get_task_ioprio
     53.95 ±  2%      +6.2       60.16        perf-profile.children.cycles-pp._raw_spin_lock
      9.16 ±  5%      -5.0        4.15 ±  2%  perf-profile.self.cycles-pp.get_task_ioprio
      9.84 ±  9%      -2.2        7.62 ±  4%  perf-profile.self.cycles-pp.__do_sys_ioprio_get
      0.16 ±  5%      +0.0        0.18 ±  4%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.18 ±  2%      +0.0        0.21 ±  5%  perf-profile.self.cycles-pp.fault_in_iov_iter_readable
      0.49            +0.0        0.52 ±  2%  perf-profile.self.cycles-pp.filemap_get_entry
      0.29 ±  6%      +0.0        0.33 ±  6%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.88 ±  2%      +0.1        0.93        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.31 ±  3%      +0.1        0.37 ±  8%  perf-profile.self.cycles-pp.__radix_tree_lookup
      1.08 ±  4%      +0.1        1.21        perf-profile.self.cycles-pp.security_task_getioprio
      1.44 ±  2%      +0.1        1.57 ±  3%  perf-profile.self.cycles-pp.fault_in_readable
      1.60 ±  3%      +0.2        1.79 ±  4%  perf-profile.self.cycles-pp.copy_page_from_iter_atomic
     48.20 ±  4%      +7.5       55.66        perf-profile.self.cycles-pp._raw_spin_lock
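
For readers checking the table above by hand, the following is a minimal sketch of how the %change column and a few derived perf-stat rows can be recomputed from the raw values. The formulas are assumptions that happen to reproduce the reported figures; they are not taken from the lkp-tests source, and the helper names (pct_change, mpki, miss_rate_pct) are hypothetical.

```python
# Sketch only: the formulas below are assumed, not copied from lkp-tests.
# Input values are taken from the comparison table in this report.

def pct_change(base, new):
    """Relative change from the parent commit to the tested commit, in percent."""
    return (new - base) / base * 100.0

def mpki(cache_misses, instructions):
    """Cache misses per thousand instructions."""
    return cache_misses / instructions * 1000.0

def miss_rate_pct(cache_misses, cache_references):
    """Cache misses as a percentage of cache references."""
    return cache_misses / cache_references * 100.0

if __name__ == "__main__":
    # stress-ng.ioprio.ops and ops_per_sec: 3b7cb74547 vs 53889bcaf5
    print(round(pct_change(4756555, 5374750), 1))
    print(round(pct_change(79272, 89575), 1))

    # perf-stat.ps.cache-misses
    print(round(pct_change(71968656, 48582093), 1))

    # perf-stat.overall.MPKI from perf-stat.ps.cache-misses / perf-stat.ps.instructions
    print(round(mpki(71968656, 1.63e10), 2), round(mpki(48582093, 1.601e10), 2))

    # perf-stat.overall.cache-miss-rate% from ps.cache-misses / ps.cache-references
    print(round(miss_rate_pct(71968656, 7.982e8), 2),
          round(miss_rate_pct(48582093, 6.912e8), 2))
```

Rows whose name ends in "%" (for example cache-miss-rate%) and the perf-profile rows appear to report the change as an absolute difference in percentage points rather than a relative percentage, so 9.02 to 7.03 is shown as -2.0.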




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

