Message-ID: <202406281308.6137dbb1-oliver.sang@intel.com>
Date: Fri, 28 Jun 2024 13:13:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Jan Kara <jack@...e.cz>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Trond Myklebust <trond.myklebust@...merspace.com>,
<linux-nfs@...r.kernel.org>, <ying.huang@...el.com>, <feng.tang@...el.com>,
<fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [nfs] a527c3ba41: filebench.sum_operations/s 180.4% improvement
Hello,
kernel test robot noticed a 180.4% improvement of filebench.sum_operations/s on:
commit: a527c3ba41c4c61e2069bfce4091e5515f06a8dd ("nfs: Avoid flushing many pages with NFS_FILE_SYNC")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: filebench
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:
disk: 1HDD
fs: btrfs
fs2: nfsv4
test: filemicro_rwritefsync.f
cpufreq_governor: performance
Details are as follows:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240628/202406281308.6137dbb1-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-13/performance/1HDD/nfsv4/btrfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/filemicro_rwritefsync.f/filebench
commit:
134d0b3f24 ("nfs: propagate readlink errors in nfs_symlink_filler")
a527c3ba41 ("nfs: Avoid flushing many pages with NFS_FILE_SYNC")
134d0b3f2440cddd a527c3ba41c4c61e2069bfce409
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.06 -32.0% 1.40 ± 3% iostat.cpu.iowait
7.17e+10 ± 3% -62.2% 2.708e+10 ± 2% cpuidle..time
3361646 -40.5% 2001605 cpuidle..usage
797.57 ± 3% -58.3% 332.24 ± 2% uptime.boot
74461 ± 3% -58.5% 30930 ± 2% uptime.idle
986.05 ± 52% -100.0% 0.00 numa-meminfo.node0.Mlocked
41610 ± 32% -59.2% 16976 ± 79% numa-meminfo.node0.Shmem
64815 ± 3% -17.0% 53823 ± 2% numa-meminfo.node1.Active(anon)
989020 ± 10% -49.7% 497288 ± 49% numa-numastat.node0.local_node
1031591 ± 10% -46.8% 549069 ± 43% numa-numastat.node0.numa_hit
1104745 ± 11% -29.8% 775663 ± 28% numa-numastat.node1.local_node
1161905 ± 9% -29.1% 823337 ± 26% numa-numastat.node1.numa_hit
2170 ± 3% +91.4% 4154 ± 2% vmstat.io.bo
1.99 -32.0% 1.35 ± 3% vmstat.procs.b
2060 +23.5% 2543 ± 2% vmstat.system.cs
4540 ± 2% +80.5% 8197 ± 2% vmstat.system.in
2.07 -0.7 1.41 ± 3% mpstat.cpu.all.iowait%
0.06 ± 3% +0.1 0.15 ± 3% mpstat.cpu.all.irq%
0.01 ± 2% +0.0 0.02 ± 2% mpstat.cpu.all.soft%
0.05 ± 6% +0.0 0.07 ± 5% mpstat.cpu.all.sys%
0.05 ± 2% +0.1 0.12 ± 2% mpstat.cpu.all.usr%
0.37 ± 10% -0.1 0.30 ± 10% perf-profile.children.cycles-pp.perf_event_task_tick
0.15 ± 16% -0.0 0.11 ± 17% perf-profile.children.cycles-pp.rcu_core
0.16 ± 13% +0.1 0.21 ± 5% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.24 ± 12% -0.1 0.19 ± 11% perf-profile.self.cycles-pp.perf_event_task_tick
0.14 ± 15% +0.0 0.19 ± 10% perf-profile.self.cycles-pp.cpuidle_governor_latency_req
2.10 +184.9% 5.98 ± 6% filebench.sum_bytes_mb/s
273.04 +180.4% 765.61 ± 6% filebench.sum_operations/s
0.00 ± 10% +27680.8% 1.20 ± 7% filebench.sum_time_ms/op
273.00 +180.4% 765.50 ± 6% filebench.sum_writes/s
746.84 ± 3% -62.3% 281.62 ± 2% filebench.time.elapsed_time
746.84 ± 3% -62.3% 281.62 ± 2% filebench.time.elapsed_time.max
246.64 ± 52% -100.0% 0.00 numa-vmstat.node0.nr_mlock
10402 ± 32% -59.2% 4243 ± 79% numa-vmstat.node0.nr_shmem
1031364 ± 10% -46.8% 548226 ± 43% numa-vmstat.node0.numa_hit
988793 ± 10% -49.8% 496445 ± 49% numa-vmstat.node0.numa_local
16202 ± 3% -17.0% 13454 ± 2% numa-vmstat.node1.nr_active_anon
16202 ± 3% -17.0% 13454 ± 2% numa-vmstat.node1.nr_zone_active_anon
1161422 ± 9% -29.2% 822014 ± 26% numa-vmstat.node1.numa_hit
1104280 ± 11% -29.9% 774340 ± 28% numa-vmstat.node1.numa_local
169724 -34.1% 111875 meminfo.Active
71034 -19.6% 57108 meminfo.Active(anon)
98690 -44.5% 54766 ± 2% meminfo.Active(file)
386512 ± 18% -46.6% 206514 ± 22% meminfo.AnonHugePages
100539 ± 4% +163.6% 264992 ± 2% meminfo.Dirty
67198 -12.6% 58722 meminfo.Mapped
1426 ± 2% -100.0% 0.00 meminfo.Mlocked
113320 -20.9% 89605 meminfo.Shmem
295425 ± 4% +125.3% 665456 meminfo.Writeback
17758 -19.6% 14279 proc-vmstat.nr_active_anon
24673 -44.5% 13701 proc-vmstat.nr_active_file
165207 -2.3% 161474 proc-vmstat.nr_anon_pages
188.72 ± 18% -46.6% 100.85 ± 22% proc-vmstat.nr_anon_transparent_hugepages
641612 -8.4% 587844 proc-vmstat.nr_dirtied
25122 ± 4% +163.5% 66189 ± 2% proc-vmstat.nr_dirty
1359330 -2.5% 1325284 proc-vmstat.nr_file_pages
174858 -3.5% 168725 proc-vmstat.nr_inactive_anon
523188 -3.3% 506043 proc-vmstat.nr_inactive_file
18536 +3.8% 19247 proc-vmstat.nr_kernel_stack
17058 -12.4% 14939 proc-vmstat.nr_mapped
356.48 ± 2% -100.0% 0.00 proc-vmstat.nr_mlock
28336 -20.9% 22408 proc-vmstat.nr_shmem
73898 ± 4% +125.0% 166281 proc-vmstat.nr_writeback
640947 -8.4% 587183 proc-vmstat.nr_written
17758 -19.6% 14279 proc-vmstat.nr_zone_active_anon
24673 -44.5% 13701 proc-vmstat.nr_zone_active_file
174858 -3.5% 168725 proc-vmstat.nr_zone_inactive_anon
523188 -3.3% 506043 proc-vmstat.nr_zone_inactive_file
41988 ± 3% +100.4% 84132 ± 2% proc-vmstat.nr_zone_write_pending
2195708 ± 5% -37.4% 1375336 ± 6% proc-vmstat.numa_hit
2095965 ± 5% -39.2% 1274986 ± 7% proc-vmstat.numa_local
46641 -13.7% 40252 proc-vmstat.pgactivate
2637615 ± 4% -32.5% 1780826 ± 5% proc-vmstat.pgalloc_normal
1924711 ± 3% -56.3% 841690 ± 3% proc-vmstat.pgfault
2504198 ± 7% -32.5% 1691266 ± 13% proc-vmstat.pgfree
1624850 -27.2% 1182486 proc-vmstat.pgpgout
89895 ± 2% -55.4% 40062 ± 5% proc-vmstat.pgreuse
2.43 +7.0% 2.60 ± 3% perf-stat.i.MPKI
67435645 ± 2% +112.8% 1.435e+08 ± 2% perf-stat.i.branch-instructions
4.56 -0.1 4.44 perf-stat.i.branch-miss-rate%
3862446 ± 2% +125.0% 8689890 ± 3% perf-stat.i.branch-misses
4.97 +2.4 7.33 ± 2% perf-stat.i.cache-miss-rate%
540701 ± 3% +86.6% 1009040 ± 2% perf-stat.i.cache-misses
7966602 +24.1% 9887537 perf-stat.i.cache-references
2039 +22.7% 2502 ± 2% perf-stat.i.context-switches
4.97e+08 ± 2% +91.0% 9.495e+08 ± 3% perf-stat.i.cpu-cycles
101.96 +4.2% 106.25 perf-stat.i.cpu-migrations
1037 +10.6% 1147 ± 3% perf-stat.i.cycles-between-cache-misses
3.314e+08 ± 2% +112.2% 7.033e+08 ± 2% perf-stat.i.instructions
0.50 +11.5% 0.56 perf-stat.i.ipc
2.11 -99.2% 0.02 ± 9% perf-stat.i.metric.K/sec
2466 +12.6% 2776 ± 2% perf-stat.i.minor-faults
2466 +12.6% 2776 ± 2% perf-stat.i.page-faults
1.63 ± 3% -12.0% 1.43 ± 3% perf-stat.overall.MPKI
5.73 +0.3 6.05 perf-stat.overall.branch-miss-rate%
6.79 ± 3% +3.4 10.21 ± 2% perf-stat.overall.cache-miss-rate%
1.50 -10.0% 1.35 perf-stat.overall.cpi
0.67 +11.1% 0.74 perf-stat.overall.ipc
67362570 ± 2% +112.3% 1.43e+08 ± 2% perf-stat.ps.branch-instructions
3858126 ± 2% +124.5% 8659606 ± 3% perf-stat.ps.branch-misses
539904 ± 3% +86.2% 1005142 ± 2% perf-stat.ps.cache-misses
7952547 +23.8% 9844369 perf-stat.ps.cache-references
2036 +22.5% 2494 ± 2% perf-stat.ps.context-switches
4.966e+08 ± 2% +90.7% 9.468e+08 ± 3% perf-stat.ps.cpu-cycles
101.81 +4.0% 105.85 perf-stat.ps.cpu-migrations
3.311e+08 ± 2% +111.7% 7.01e+08 ± 2% perf-stat.ps.instructions
2461 +12.2% 2762 ± 2% perf-stat.ps.minor-faults
2461 +12.2% 2762 ± 2% perf-stat.ps.page-faults
2.475e+11 -20.0% 1.98e+11 perf-stat.total.instructions
0.04 ± 4% +31.6% 0.05 ± 8% sched_debug.cfs_rq:/.h_nr_running.avg
20.10 ± 14% +60.4% 32.25 ± 20% sched_debug.cfs_rq:/.load_avg.avg
0.04 ± 3% +31.7% 0.05 ± 8% sched_debug.cfs_rq:/.nr_running.avg
7.67 ± 37% +146.1% 18.87 ± 29% sched_debug.cfs_rq:/.removed.load_avg.avg
3.51 ± 42% +138.3% 8.37 ± 29% sched_debug.cfs_rq:/.removed.runnable_avg.avg
3.51 ± 42% +138.3% 8.37 ± 29% sched_debug.cfs_rq:/.removed.util_avg.avg
38.15 ± 6% +101.4% 76.85 ± 8% sched_debug.cfs_rq:/.runnable_avg.avg
98.16 ± 5% +39.5% 136.95 ± 11% sched_debug.cfs_rq:/.runnable_avg.stddev
37.92 ± 6% +101.5% 76.42 ± 7% sched_debug.cfs_rq:/.util_avg.avg
656.80 ± 4% +18.8% 780.15 ± 15% sched_debug.cfs_rq:/.util_avg.max
97.57 ± 5% +39.9% 136.52 ± 11% sched_debug.cfs_rq:/.util_avg.stddev
3.28 ± 25% +98.3% 6.50 ± 40% sched_debug.cfs_rq:/.util_est.avg
123.73 ± 11% +45.4% 179.95 ± 12% sched_debug.cfs_rq:/.util_est.max
17.32 ± 11% +68.8% 29.24 ± 22% sched_debug.cfs_rq:/.util_est.stddev
389566 ± 7% -57.8% 164509 ± 7% sched_debug.cpu.clock.avg
389580 ± 7% -57.8% 164520 ± 7% sched_debug.cpu.clock.max
389555 ± 7% -57.8% 164499 ± 7% sched_debug.cpu.clock.min
8.38 ± 16% -27.1% 6.11 ± 12% sched_debug.cpu.clock.stddev
388964 ± 7% -57.8% 164063 ± 7% sched_debug.cpu.clock_task.avg
389309 ± 7% -57.8% 164329 ± 7% sched_debug.cpu.clock_task.max
381467 ± 7% -58.9% 156750 ± 7% sched_debug.cpu.clock_task.min
12392 ± 5% -46.0% 6695 ± 4% sched_debug.cpu.curr->pid.max
1368 ± 6% -36.2% 872.62 ± 4% sched_debug.cpu.curr->pid.stddev
0.03 ± 10% +52.8% 0.04 ± 10% sched_debug.cpu.nr_running.avg
0.15 ± 7% +16.5% 0.17 ± 5% sched_debug.cpu.nr_running.stddev
9004 ± 5% -47.9% 4694 ± 6% sched_debug.cpu.nr_switches.avg
77261 ± 22% -40.5% 46007 ± 9% sched_debug.cpu.nr_switches.max
1542 ± 6% -54.4% 702.61 ± 8% sched_debug.cpu.nr_switches.min
10459 ± 10% -40.6% 6217 ± 6% sched_debug.cpu.nr_switches.stddev
0.07 ± 5% -73.2% 0.02 ± 17% sched_debug.cpu.nr_uninterruptible.avg
389570 ± 7% -57.8% 164510 ± 7% sched_debug.cpu_clk
388998 ± 7% -57.9% 163938 ± 7% sched_debug.ktime
390127 ± 7% -57.7% 165072 ± 7% sched_debug.sched_clk
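As a quick sanity check on the headline number, the %change column in the tables above follows the usual (new - old) / old convention; recomputing it from the before/after means of filebench.sum_operations/s reproduces the reported 180.4%:

```python
# Recompute a %change value from the two per-commit means in the table.
# Convention assumed: %change = (new - old) / old * 100.
def pct_change(old: float, new: float) -> float:
    return (new - old) / old * 100.0

# filebench.sum_operations/s: 273.04 (134d0b3f24) -> 765.61 (a527c3ba41)
print(f"{pct_change(273.04, 765.61):.1f}%")  # prints "180.4%"
```

Small discrepancies against other rows are expected, since the robot computes percentages from unrounded per-run data rather than the rounded means shown here.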
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki