Message-ID: <202411261616.c29946d8-lkp@intel.com>
Date: Tue, 26 Nov 2024 16:44:23 +0800
From: kernel test robot <oliver.sang@...el.com>
To: David Howells <dhowells@...hat.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Christian Brauner <brauner@...nel.org>, Steve French <sfrench@...ba.org>,
Paulo Alcantara <pc@...guebit.com>, Trond Myklebust <trondmy@...nel.org>,
Jeff Layton <jlayton@...nel.org>, <netfs@...ts.linux.dev>,
<linux-fsdevel@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [netfs] d6a77668a7: filebench.sum_operations/s 158.3% improvement
Hello,
kernel test robot noticed a 158.3% improvement of filebench.sum_operations/s on:
commit: d6a77668a708f0b5ca6713b39c178c9d9563c35b ("netfs: Downgrade i_rwsem for a buffered write")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:
disk: 1HDD
fs: xfs
fs2: cifs
test: randomrw.f
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241126/202411261616.c29946d8-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/1HDD/cifs/xfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/randomrw.f/filebench
commit:
6ed469df0b ("nilfs2: fix kernel bug due to missing clearing of buffer delay flag")
d6a77668a7 ("netfs: Downgrade i_rwsem for a buffered write")
6ed469df0bfbef3e d6a77668a708f0b5ca6713b39c1
---------------- ---------------------------
value ± %stddev    %change    value ± %stddev    metric
10356023 ± 13% -88.4% 1203898 ± 8% cpuidle..usage
1862 ± 17% -45.6% 1013 ± 23% perf-c2c.HITM.local
564994 ± 9% -86.4% 76928 ± 36% numa-meminfo.node1.Active(anon)
585171 ± 7% -84.9% 88374 ± 38% numa-meminfo.node1.Shmem
124475 ± 13% -92.9% 8821 ± 14% vmstat.system.cs
9926 ± 6% -39.6% 5995 ± 4% vmstat.system.in
576365 ± 10% -83.0% 98054 ± 27% meminfo.Active(anon)
1481440 ± 4% -33.1% 991806 ± 2% meminfo.Committed_AS
613566 ± 10% -79.8% 124007 ± 22% meminfo.Shmem
0.02 ± 3% -0.0 0.02 ± 4% mpstat.cpu.all.irq%
0.60 ± 2% +0.1 0.69 mpstat.cpu.all.sys%
0.18 +0.0 0.22 ± 6% mpstat.cpu.all.usr%
141224 ± 9% -86.4% 19203 ± 36% numa-vmstat.node1.nr_active_anon
146313 ± 7% -84.9% 22087 ± 38% numa-vmstat.node1.nr_shmem
141224 ± 9% -86.4% 19203 ± 36% numa-vmstat.node1.nr_zone_active_anon
91197 ± 22% -93.7% 5768 ± 19% sched_debug.cpu.nr_switches.avg
6021808 ± 30% -96.1% 232641 ± 32% sched_debug.cpu.nr_switches.max
616189 ± 24% -95.9% 25525 ± 31% sched_debug.cpu.nr_switches.stddev
144168 ± 10% -83.0% 24516 ± 27% proc-vmstat.nr_active_anon
3501815 -3.8% 3369305 proc-vmstat.nr_file_pages
28035 -5.9% 26386 proc-vmstat.nr_mapped
153431 ± 10% -79.8% 31026 ± 22% proc-vmstat.nr_shmem
25506 -1.6% 25092 proc-vmstat.nr_slab_reclaimable
144168 ± 10% -83.0% 24516 ± 27% proc-vmstat.nr_zone_active_anon
1443064 -7.1% 1340212 proc-vmstat.pgactivate
2557 ± 14% +158.3% 6606 ± 10% filebench.sum_bytes_mb/s
19644866 ± 14% +158.3% 50742596 ± 10% filebench.sum_operations
327385 ± 14% +158.3% 845638 ± 10% filebench.sum_operations/s
163882 ± 14% +189.5% 474419 ± 12% filebench.sum_reads/s
0.01 ± 15% -65.7% 0.00 filebench.sum_time_ms/op
163502 ± 14% +127.0% 371220 ± 9% filebench.sum_writes/s
56.83 +29.0% 73.33 filebench.time.percent_of_cpu_this_job_got
85.87 ± 2% +20.1% 103.10 ± 2% filebench.time.system_time
8.54 ± 10% +115.4% 18.39 ± 16% filebench.time.user_time
9795275 ± 14% -99.3% 67709 ± 70% filebench.time.voluntary_context_switches
0.01 ± 29% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
0.01 ± 19% -100.0% 0.00 perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
0.00 ± 67% +469.2% 0.01 ± 12% perf-sched.total_sch_delay.average.ms
1.33 ± 13% +975.0% 14.30 ± 33% perf-sched.total_wait_and_delay.average.ms
724911 ± 10% -89.6% 75232 ± 36% perf-sched.total_wait_and_delay.count.ms
1.33 ± 13% +976.1% 14.29 ± 33% perf-sched.total_wait_time.average.ms
3.47 ± 11% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
54.35 ± 8% +403.1% 273.44 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
19.50 ± 30% -100.0% 0.00 perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
280.83 ± 12% -79.1% 58.83 ± 24% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
649458 ± 10% -99.1% 6085 ± 56% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
4.62 ± 11% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
1001 +25.4% 1254 ± 17% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.01 ± 22% -100.0% 0.00 perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
54.34 ± 8% +403.2% 273.41 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.01 ± 19% -100.0% 0.00 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
1001 +25.4% 1254 ± 17% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.15 ± 44% -69.5% 0.05 ± 28% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
3.23 ±100% -1.0 2.18 ±142% perf-profile.calltrace.cycles-pp.cmd_stat
3.23 ±100% -1.0 2.18 ±142% perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat
3.22 ±100% -1.0 2.17 ±141% perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat
3.12 ±100% -1.0 2.12 ±142% perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat
0.42 ± 34% -0.2 0.24 ± 28% perf-profile.children.cycles-pp.perf_iterate_sb
0.42 ± 22% -0.1 0.28 ± 22% perf-profile.children.cycles-pp.set_pte_range
0.11 ± 38% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
0.11 ± 56% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.read@plt
0.02 ±141% +0.1 0.12 ± 29% perf-profile.children.cycles-pp.aa_file_perm
0.07 ± 55% +0.1 0.17 ± 29% perf-profile.children.cycles-pp.fault_in_iov_iter_readable
0.07 ± 55% +0.1 0.17 ± 29% perf-profile.children.cycles-pp.fault_in_readable
0.09 ± 50% +0.1 0.22 ± 28% perf-profile.children.cycles-pp.getenv
0.21 ± 30% +0.2 0.37 ± 34% perf-profile.children.cycles-pp.__perf_read_group_add
0.19 ± 44% +0.2 0.36 ± 34% perf-profile.children.cycles-pp.pcpu_alloc_noprof
0.82 ± 11% +0.4 1.20 ± 13% perf-profile.children.cycles-pp.sched_balance_update_blocked_averages
0.11 ± 38% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
0.11 ± 56% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.read@plt
0.02 ±141% +0.1 0.12 ± 29% perf-profile.self.cycles-pp.aa_file_perm
0.02 ±141% +0.1 0.12 ± 31% perf-profile.self.cycles-pp.getenv
5.49 ± 4% +87.1% 10.27 ± 4% perf-stat.i.MPKI
6.113e+08 ± 6% -21.6% 4.793e+08 ± 5% perf-stat.i.branch-instructions
12875097 -9.6% 11640297 perf-stat.i.branch-misses
26605878 ± 8% +61.4% 42952527 ± 5% perf-stat.i.cache-misses
89659393 ± 6% +53.9% 1.38e+08 ± 6% perf-stat.i.cache-references
126410 ± 13% -93.0% 8884 ± 15% perf-stat.i.context-switches
1.85 ± 2% +8.3% 2.00 ± 2% perf-stat.i.cpi
2.757e+09 ± 6% -17.8% 2.265e+09 ± 4% perf-stat.i.instructions
0.58 ± 2% -8.1% 0.53 ± 2% perf-stat.i.ipc
1.00 ± 13% -97.3% 0.03 ± 57% perf-stat.i.metric.K/sec
9.63 ± 4% +96.7% 18.95 ± 2% perf-stat.overall.MPKI
2.11 ± 5% +0.3 2.42 ± 5% perf-stat.overall.branch-miss-rate%
1.56 ± 6% +19.7% 1.86 ± 4% perf-stat.overall.cpi
161.89 ± 7% -39.3% 98.29 ± 4% perf-stat.overall.cycles-between-cache-misses
0.65 ± 5% -16.5% 0.54 ± 5% perf-stat.overall.ipc
6.088e+08 ± 6% -21.3% 4.791e+08 ± 5% perf-stat.ps.branch-instructions
12794450 -9.6% 11566995 perf-stat.ps.branch-misses
26464019 ± 8% +62.1% 42902925 ± 4% perf-stat.ps.cache-misses
89144844 ± 7% +54.5% 1.378e+08 ± 6% perf-stat.ps.cache-references
126023 ± 13% -93.0% 8808 ± 15% perf-stat.ps.context-switches
2.746e+09 ± 6% -17.5% 2.264e+09 ± 4% perf-stat.ps.instructions
4.542e+11 ± 6% -17.4% 3.753e+11 ± 4% perf-stat.total.instructions
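The %change column and the MPKI rows above can be cross-checked by hand. Below is a minimal sketch that recomputes a few of them from the values in this report. The helper names `pct_change` and `mpki` are hypothetical, and the formulas are assumptions about how the columns are derived (relative change against the base commit, and cache misses per thousand instructions), not something taken from the lkp-tests source.

```python
# Sanity-check a few rows of the comparison table above.
# All numbers are copied from this report. The formulas are assumptions
# about how the columns were derived, not taken from the lkp-tests source.


def pct_change(base: float, patched: float) -> float:
    """Relative change from the base commit to the patched commit, in percent."""
    return (patched - base) / base * 100.0


def mpki(cache_misses: float, instructions: float) -> float:
    """Cache misses per thousand instructions (assumed meaning of MPKI)."""
    return cache_misses / instructions * 1000.0


# metric name -> (value on 6ed469df0b, value on d6a77668a7)
rows = {
    "filebench.sum_operations/s": (327385, 845638),
    "filebench.time.voluntary_context_switches": (9795275, 67709),
    "vmstat.system.cs": (124475, 8821),
}

changes = {name: round(pct_change(b, p), 1) for name, (b, p) in rows.items()}
for name, change in changes.items():
    print(f"{change:+7.1f}%  {name}")

# perf-stat.ps.cache-misses and perf-stat.ps.instructions for each commit.
base_mpki = mpki(26464019, 2.746e9)
patched_mpki = mpki(42902925, 2.264e9)
print(f"MPKI: {base_mpki:.2f} -> {patched_mpki:.2f}")
```

The recomputed percentages agree with the %change column for these three rows. The MPKI values derived from the per-second counters land within rounding distance of the reported perf-stat.overall.MPKI figures (9.63 and 18.95). A small residual is expected if the report averages per-run ratios rather than dividing the averaged counters, which is a guess on my part.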
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki