Message-ID: <202411261616.c29946d8-lkp@intel.com>
Date: Tue, 26 Nov 2024 16:44:23 +0800
From: kernel test robot <oliver.sang@...el.com>
To: David Howells <dhowells@...hat.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Christian Brauner <brauner@...nel.org>, Steve French <sfrench@...ba.org>,
	Paulo Alcantara <pc@...guebit.com>, Trond Myklebust <trondmy@...nel.org>,
	Jeff Layton <jlayton@...nel.org>, <netfs@...ts.linux.dev>,
	<linux-fsdevel@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [netfs]  d6a77668a7:  filebench.sum_operations/s
 158.3% improvement



Hello,

kernel test robot noticed a 158.3% improvement of filebench.sum_operations/s on:


commit: d6a77668a708f0b5ca6713b39c178c9d9563c35b ("netfs: Downgrade i_rwsem for a buffered write")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
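
The headline figure can be re-derived from the per-commit means in the filebench rows of the comparison table further down. The sketch below assumes that %change is the relative difference between the two means, (new - base) / base; the helper name `percent_change` is illustrative and is not part of the LKP tooling.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed definition of %change)."""
    return (new - base) / base * 100.0


# Means copied from the filebench rows of the comparison table:
# (value on parent commit 6ed469df0b, value on d6a77668a7).
rows = {
    "filebench.sum_operations/s": (327385, 845638),
    "filebench.sum_reads/s": (163882, 474419),
    "filebench.sum_writes/s": (163502, 371220),
}

for metric, (base, new) in rows.items():
    print(f"{metric}: {percent_change(base, new):+.1f}%")
```

Under that assumption the three rows come out at +158.3%, +189.5% and +127.0%, matching the %change column. The table reports rounded means, so other rows may differ from the reported value in the last digit.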


testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: xfs
	fs2: cifs
	test: randomrw.f
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241126/202411261616.c29946d8-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/performance/1HDD/cifs/xfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/randomrw.f/filebench

commit: 
  6ed469df0b ("nilfs2: fix kernel bug due to missing clearing of buffer delay flag")
  d6a77668a7 ("netfs: Downgrade i_rwsem for a buffered write")

6ed469df0bfbef3e d6a77668a708f0b5ca6713b39c1 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  10356023 ± 13%     -88.4%    1203898 ±  8%  cpuidle..usage
      1862 ± 17%     -45.6%       1013 ± 23%  perf-c2c.HITM.local
    564994 ±  9%     -86.4%      76928 ± 36%  numa-meminfo.node1.Active(anon)
    585171 ±  7%     -84.9%      88374 ± 38%  numa-meminfo.node1.Shmem
    124475 ± 13%     -92.9%       8821 ± 14%  vmstat.system.cs
      9926 ±  6%     -39.6%       5995 ±  4%  vmstat.system.in
    576365 ± 10%     -83.0%      98054 ± 27%  meminfo.Active(anon)
   1481440 ±  4%     -33.1%     991806 ±  2%  meminfo.Committed_AS
    613566 ± 10%     -79.8%     124007 ± 22%  meminfo.Shmem
      0.02 ±  3%      -0.0        0.02 ±  4%  mpstat.cpu.all.irq%
      0.60 ±  2%      +0.1        0.69        mpstat.cpu.all.sys%
      0.18            +0.0        0.22 ±  6%  mpstat.cpu.all.usr%
    141224 ±  9%     -86.4%      19203 ± 36%  numa-vmstat.node1.nr_active_anon
    146313 ±  7%     -84.9%      22087 ± 38%  numa-vmstat.node1.nr_shmem
    141224 ±  9%     -86.4%      19203 ± 36%  numa-vmstat.node1.nr_zone_active_anon
     91197 ± 22%     -93.7%       5768 ± 19%  sched_debug.cpu.nr_switches.avg
   6021808 ± 30%     -96.1%     232641 ± 32%  sched_debug.cpu.nr_switches.max
    616189 ± 24%     -95.9%      25525 ± 31%  sched_debug.cpu.nr_switches.stddev
    144168 ± 10%     -83.0%      24516 ± 27%  proc-vmstat.nr_active_anon
   3501815            -3.8%    3369305        proc-vmstat.nr_file_pages
     28035            -5.9%      26386        proc-vmstat.nr_mapped
    153431 ± 10%     -79.8%      31026 ± 22%  proc-vmstat.nr_shmem
     25506            -1.6%      25092        proc-vmstat.nr_slab_reclaimable
    144168 ± 10%     -83.0%      24516 ± 27%  proc-vmstat.nr_zone_active_anon
   1443064            -7.1%    1340212        proc-vmstat.pgactivate
      2557 ± 14%    +158.3%       6606 ± 10%  filebench.sum_bytes_mb/s
  19644866 ± 14%    +158.3%   50742596 ± 10%  filebench.sum_operations
    327385 ± 14%    +158.3%     845638 ± 10%  filebench.sum_operations/s
    163882 ± 14%    +189.5%     474419 ± 12%  filebench.sum_reads/s
      0.01 ± 15%     -65.7%       0.00        filebench.sum_time_ms/op
    163502 ± 14%    +127.0%     371220 ±  9%  filebench.sum_writes/s
     56.83           +29.0%      73.33        filebench.time.percent_of_cpu_this_job_got
     85.87 ±  2%     +20.1%     103.10 ±  2%  filebench.time.system_time
      8.54 ± 10%    +115.4%      18.39 ± 16%  filebench.time.user_time
   9795275 ± 14%     -99.3%      67709 ± 70%  filebench.time.voluntary_context_switches
      0.01 ± 29%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.01 ± 19%    -100.0%       0.00        perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.00 ± 67%    +469.2%       0.01 ± 12%  perf-sched.total_sch_delay.average.ms
      1.33 ± 13%    +975.0%      14.30 ± 33%  perf-sched.total_wait_and_delay.average.ms
    724911 ± 10%     -89.6%      75232 ± 36%  perf-sched.total_wait_and_delay.count.ms
      1.33 ± 13%    +976.1%      14.29 ± 33%  perf-sched.total_wait_time.average.ms
      3.47 ± 11%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     54.35 ±  8%    +403.1%     273.44 ± 19%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
     19.50 ± 30%    -100.0%       0.00        perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    280.83 ± 12%     -79.1%      58.83 ± 24%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
    649458 ± 10%     -99.1%       6085 ± 56%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
      4.62 ± 11%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      1001           +25.4%       1254 ± 17%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.01 ± 22%    -100.0%       0.00        perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
     54.34 ±  8%    +403.2%     273.41 ± 19%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.01 ± 19%    -100.0%       0.00        perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      1001           +25.4%       1254 ± 17%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.15 ± 44%     -69.5%       0.05 ± 28%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
      3.23 ±100%      -1.0        2.18 ±142%  perf-profile.calltrace.cycles-pp.cmd_stat
      3.23 ±100%      -1.0        2.18 ±142%  perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat
      3.22 ±100%      -1.0        2.17 ±141%  perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat
      3.12 ±100%      -1.0        2.12 ±142%  perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat
      0.42 ± 34%      -0.2        0.24 ± 28%  perf-profile.children.cycles-pp.perf_iterate_sb
      0.42 ± 22%      -0.1        0.28 ± 22%  perf-profile.children.cycles-pp.set_pte_range
      0.11 ± 38%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      0.11 ± 56%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.read@plt
      0.02 ±141%      +0.1        0.12 ± 29%  perf-profile.children.cycles-pp.aa_file_perm
      0.07 ± 55%      +0.1        0.17 ± 29%  perf-profile.children.cycles-pp.fault_in_iov_iter_readable
      0.07 ± 55%      +0.1        0.17 ± 29%  perf-profile.children.cycles-pp.fault_in_readable
      0.09 ± 50%      +0.1        0.22 ± 28%  perf-profile.children.cycles-pp.getenv
      0.21 ± 30%      +0.2        0.37 ± 34%  perf-profile.children.cycles-pp.__perf_read_group_add
      0.19 ± 44%      +0.2        0.36 ± 34%  perf-profile.children.cycles-pp.pcpu_alloc_noprof
      0.82 ± 11%      +0.4        1.20 ± 13%  perf-profile.children.cycles-pp.sched_balance_update_blocked_averages
      0.11 ± 38%      -0.1        0.04 ± 71%  perf-profile.self.cycles-pp.copy_page_from_iter_atomic
      0.11 ± 56%      -0.1        0.04 ± 71%  perf-profile.self.cycles-pp.read@plt
      0.02 ±141%      +0.1        0.12 ± 29%  perf-profile.self.cycles-pp.aa_file_perm
      0.02 ±141%      +0.1        0.12 ± 31%  perf-profile.self.cycles-pp.getenv
      5.49 ±  4%     +87.1%      10.27 ±  4%  perf-stat.i.MPKI
 6.113e+08 ±  6%     -21.6%  4.793e+08 ±  5%  perf-stat.i.branch-instructions
  12875097            -9.6%   11640297        perf-stat.i.branch-misses
  26605878 ±  8%     +61.4%   42952527 ±  5%  perf-stat.i.cache-misses
  89659393 ±  6%     +53.9%   1.38e+08 ±  6%  perf-stat.i.cache-references
    126410 ± 13%     -93.0%       8884 ± 15%  perf-stat.i.context-switches
      1.85 ±  2%      +8.3%       2.00 ±  2%  perf-stat.i.cpi
 2.757e+09 ±  6%     -17.8%  2.265e+09 ±  4%  perf-stat.i.instructions
      0.58 ±  2%      -8.1%       0.53 ±  2%  perf-stat.i.ipc
      1.00 ± 13%     -97.3%       0.03 ± 57%  perf-stat.i.metric.K/sec
      9.63 ±  4%     +96.7%      18.95 ±  2%  perf-stat.overall.MPKI
      2.11 ±  5%      +0.3        2.42 ±  5%  perf-stat.overall.branch-miss-rate%
      1.56 ±  6%     +19.7%       1.86 ±  4%  perf-stat.overall.cpi
    161.89 ±  7%     -39.3%      98.29 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.65 ±  5%     -16.5%       0.54 ±  5%  perf-stat.overall.ipc
 6.088e+08 ±  6%     -21.3%  4.791e+08 ±  5%  perf-stat.ps.branch-instructions
  12794450            -9.6%   11566995        perf-stat.ps.branch-misses
  26464019 ±  8%     +62.1%   42902925 ±  4%  perf-stat.ps.cache-misses
  89144844 ±  7%     +54.5%  1.378e+08 ±  6%  perf-stat.ps.cache-references
    126023 ± 13%     -93.0%       8808 ± 15%  perf-stat.ps.context-switches
 2.746e+09 ±  6%     -17.5%  2.264e+09 ±  4%  perf-stat.ps.instructions
 4.542e+11 ±  6%     -17.4%  3.753e+11 ±  4%  perf-stat.total.instructions
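
For post-processing, the rows above can be split into fields with a small parser. This is a minimal sketch that assumes the layout seen in this report (mean, optional "± N%" stddev, change, mean, optional stddev, metric name); `Row`, `ROW_RE` and `parse_row` are hypothetical names, not an LKP interface. The change field is kept as a string because it is a percentage for most metrics but an absolute delta for metrics that are already percentages (for example mpstat.cpu.all.sys%).

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    base: float                      # mean on the parent commit
    base_stddev_pct: Optional[int]   # "± N%" after the base mean, if present
    change: str                      # e.g. "+158.3%" or "+0.1" (absolute delta)
    new: float                       # mean on the tested commit
    new_stddev_pct: Optional[int]    # "± N%" after the new mean, if present
    metric: str


_NUM = r"[0-9][0-9.]*(?:e[+-]?[0-9]+)?"   # 327385, 0.60, 6.113e+08
_SD = r"(?:\s*±\s*([0-9]+)%)?"            # optional stddev column
ROW_RE = re.compile(
    r"^\s*(" + _NUM + r")" + _SD
    + r"\s+([+-][0-9.]+%?)"
    + r"\s+(" + _NUM + r")" + _SD
    + r"\s+(\S+)\s*$"
)


def parse_row(line: str) -> Optional[Row]:
    """Parse one data row; return None for headers, rules and blank lines."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    base, base_sd, change, new, new_sd, metric = m.groups()
    return Row(
        float(base),
        int(base_sd) if base_sd is not None else None,
        change,
        float(new),
        int(new_sd) if new_sd is not None else None,
        metric,
    )


if __name__ == "__main__":
    sample = "    327385 ± 14%    +158.3%     845638 ± 10%  filebench.sum_operations/s"
    print(parse_row(sample))
```

Rows without a stddev column (such as proc-vmstat.nr_file_pages) parse with the corresponding field set to None.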




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

