Message-ID: <202501101058.cd8beeba-lkp@intel.com>
Date: Fri, 10 Jan 2025 11:14:51 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
	<linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [viro-vfs:work.d_revalidate] [dcache]  077ab1260a:
 will-it-scale.per_process_ops 1.9% improvement



Hello,

kernel test robot noticed a 1.9% improvement of will-it-scale.per_process_ops on:


commit: 077ab1260a52068a62a5fb08fa2c5f1d0dcf2738 ("dcache: back inline names with a struct-wrapped array of unsigned long")
https://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git work.d_revalidate

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: process
	test: poll2
	cpufreq_governor: performance
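
For readers unfamiliar with the workload, the following is a loose Python
illustration of a poll-heavy loop of the kind the "poll2" test name suggests.
It is not the will-it-scale source: the helper name poll_loop, the use of
pipes, and the default of 128 descriptors are all assumptions made for
illustration. Each iteration issues one zero-timeout poll() over the whole
descriptor set, so the kernel has to look up every fd on every call.

```python
import os
import select


def poll_loop(nr_fds=128, iterations=1000):
    """Poll nr_fds idle pipe read ends `iterations` times with a zero timeout.

    Hypothetical helper for illustration only; the fd count and the use of
    pipes are assumptions, not taken from the will-it-scale test case.
    Returns (operations completed, total ready events seen).
    """
    pipes = [os.pipe() for _ in range(nr_fds)]
    poller = select.poll()
    for read_end, _write_end in pipes:
        poller.register(read_end, select.POLLIN)

    ops = 0
    ready_total = 0
    try:
        for _ in range(iterations):
            # Timeout 0: return immediately; nothing was written, so no fd is ready.
            ready_total += len(poller.poll(0))
            ops += 1
    finally:
        for read_end, write_end in pipes:
            os.close(read_end)
            os.close(write_end)
    return ops, ready_total


if __name__ == "__main__":
    print(poll_loop())
```

In "process" mode with nr_task at 100%, one such loop would run in a separate
process per hardware thread (104 on this machine), and the per-process
iteration rate is what the report tracks.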






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250110/202501101058.cd8beeba-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit: 
  cf0cc84299 ("make sure that DNAME_INLINE_LEN is a multiple of word size")
  077ab1260a ("dcache: back inline names with a struct-wrapped array of unsigned long")

cf0cc842995ca3da 077ab1260a52068a62a5fb08fa2 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    294.00 ± 10%     +15.2%     338.67 ±  5%  perf-c2c.DRAM.remote
    243.33 ±  9%     +13.7%     276.67 ±  6%  perf-c2c.HITM.remote
     21502 ±  5%    +413.7%     110453 ±117%  sched_debug.cfs_rq:/.load.max
      2543 ±  6%    +336.8%      11109 ±111%  sched_debug.cfs_rq:/.load.stddev
    274.83 ± 19%     +28.8%     353.86 ±  6%  sched_debug.cfs_rq:/.util_est.min
  24387540            +1.9%   24841387        will-it-scale.104.processes
    234495            +1.9%     238859        will-it-scale.per_process_ops
  24387540            +1.9%   24841387        will-it-scale.workload
      0.85 ± 11%     -20.5%       0.68 ± 10%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.71 ± 11%     -20.6%       1.36 ± 10%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
     38.41 ±104%     -78.0%       8.46        perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      3676 ± 13%     -34.3%       2415 ± 21%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.85 ± 11%     -20.5%       0.68 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
      3676 ± 13%     -34.3%       2415 ± 21%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
 4.591e+10            +1.9%  4.676e+10        perf-stat.i.branch-instructions
 1.367e+08            +1.9%  1.392e+08        perf-stat.i.branch-misses
      1.08            -1.9%       1.06        perf-stat.i.cpi
 2.584e+11            +1.9%  2.632e+11        perf-stat.i.instructions
      0.92            +1.9%       0.94        perf-stat.i.ipc
      1.08            -1.8%       1.06        perf-stat.overall.cpi
      0.93            +1.9%       0.94        perf-stat.overall.ipc
 4.575e+10            +1.9%   4.66e+10        perf-stat.ps.branch-instructions
 1.362e+08            +1.9%  1.388e+08        perf-stat.ps.branch-misses
 2.575e+11            +1.9%  2.623e+11        perf-stat.ps.instructions
 7.785e+13            +1.9%   7.93e+13        perf-stat.total.instructions
     59.17            -1.5       57.63        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.18            -1.4       69.76        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     70.73            -1.4       69.32        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     72.76            -1.3       71.48        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     76.80            -1.1       75.70        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     43.66            -1.1       42.61        perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
     94.61            -0.2       94.40        perf-profile.calltrace.cycles-pp.__poll
      0.92            +0.0        0.94        perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.66            +0.1        2.73        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
      4.90            +0.2        5.10        perf-profile.calltrace.cycles-pp.testcase
      5.81            +0.2        6.04        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
      1.98 ±  3%      +0.3        2.26 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
      7.25            +0.3        7.56        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
     59.29            -1.6       57.72        perf-profile.children.cycles-pp.do_poll
     71.24            -1.4       69.83        perf-profile.children.cycles-pp.__x64_sys_poll
     70.82            -1.4       69.41        perf-profile.children.cycles-pp.do_sys_poll
     72.83            -1.3       71.55        perf-profile.children.cycles-pp.do_syscall_64
     76.94            -1.1       75.84        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     43.57            -1.0       42.53        perf-profile.children.cycles-pp.fdget
     95.18            -0.2       94.97        perf-profile.children.cycles-pp.__poll
      1.16 ±  2%      +0.2        1.32 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      3.50            +0.2        3.69        perf-profile.children.cycles-pp.entry_SYSCALL_64
      4.91            +0.2        5.12        perf-profile.children.cycles-pp.testcase
      6.22            +0.2        6.46        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      7.31            +0.3        7.62        perf-profile.children.cycles-pp.syscall_return_via_sysret
     42.16            -1.0       41.16        perf-profile.self.cycles-pp.fdget
     16.86            -0.6       16.30        perf-profile.self.cycles-pp.do_poll
      0.90            +0.0        0.93        perf-profile.self.cycles-pp.kfree
      0.32 ±  2%      +0.0        0.36 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.20 ±  3%      +0.1        1.32 ±  2%  perf-profile.self.cycles-pp.__poll
      0.76 ±  2%      +0.1        0.89 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      4.88            +0.1        5.00        perf-profile.self.cycles-pp.do_sys_poll
      3.10            +0.2        3.28        perf-profile.self.cycles-pp.entry_SYSCALL_64
      4.18            +0.2        4.37        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      4.73            +0.2        4.94        perf-profile.self.cycles-pp.testcase
      6.16            +0.2        6.40        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      7.30            +0.3        7.62        perf-profile.self.cycles-pp.syscall_return_via_sysret
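
The headline figures in the table above can be re-derived from the two
columns of means. The sketch below assumes that %change is the relative
difference against the parent commit, that perf-profile rows report a
difference in percentage points rather than a relative change, and that
per_process_ops is the total workload divided by the 104 processes; these
readings match the numbers shown but are inferred here, not stated in the
report. The function names are made up for illustration.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, as used for rows already expressed in percent."""
    return new - base


NR_PROCESSES = 104  # "104 threads" with nr_task: 100%

# will-it-scale.workload for the parent commit and the tested commit
workload_base, workload_new = 24387540, 24841387

# Assumption: per-process ops is the workload split evenly, truncated
per_proc_base = workload_base // NR_PROCESSES
per_proc_new = workload_new // NR_PROCESSES

if __name__ == "__main__":
    print(per_proc_base, per_proc_new)
    print(f"{pct_change(per_proc_base, per_proc_new):+.1f}%")
    # perf-profile.children.cycles-pp.do_poll: 59.29 -> 57.72
    print(f"{point_delta(59.29, 57.72):+.1f}")
```

Under these assumptions the script reproduces the 234495 and 238859 means,
the +1.9% change, and the -1.6 shown for do_poll in the children profile.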




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

