Message-ID: <20220414142720.GC6935@xsang-OptiPlex-9020>
Date: Thu, 14 Apr 2022 22:27:20 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Christoph Hellwig <hch@....de>
Cc: Christoph Hellwig <hch@....de>, lkp@...ts.01.org, lkp@...el.com,
ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com,
LKML <linux-kernel@...r.kernel.org>
Subject: [block] 70bed0d544: fsmark.files_per_sec 92.0% improvement
Greetings,
FYI, we noticed a 92.0% improvement of fsmark.files_per_sec due to commit:
commit: 70bed0d5447e08702c7595d26c88ca37e8eb88b4 ("block: add a bdev_write_cache helper")
git://git.infradead.org/users/hch/block.git block-api
in testcase: fsmark
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
with following parameters:
iterations: 1x
nr_threads: 1t
disk: 1HDD
fs: btrfs
fs2: nfsv4
filesize: 4K
test_size: 40M
sync_method: fsyncBeforeClose
nr_files_per_directory: 1fpd
cpufreq_governor: performance
ucode: 0xd000331
test-description: fsmark is a file system benchmark that tests synchronous write workloads, such as a mail server workload.
test-url: https://sourceforge.net/projects/fsmark/
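For reference, the job parameters above correspond roughly to an fs_mark invocation along the following lines. The flag mapping (-s, -n, -N, -t, -S, -L) and the mount point are assumptions drawn from fs_mark's usage text, so verify them against your build:

```shell
# Derive the file count from the job parameters (test_size 40M, filesize 4K).
FILE_SIZE=4096
NR_FILES=$(( 40 * 1024 * 1024 / FILE_SIZE ))
echo "files: $NR_FILES"   # 10240

# Hypothetical invocation: 1 file per directory, 1 thread,
# fsync-before-close sync method, 1 iteration.
if command -v fs_mark >/dev/null 2>&1; then
    fs_mark -d /mnt/test/fsmark -s "$FILE_SIZE" -n "$NR_FILES" \
            -N 1 -t 1 -S 1 -L 1
else
    echo "fs_mark not installed; skipping run"
fi
```

In the actual job, lkp-tests drives fs_mark against an nfsv4 export backed by btrfs on a single HDD, per the fs/fs2/disk parameters above.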
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
gcc-11/performance/1HDD/4K/nfsv4/btrfs/1x/x86_64-rhel-8.3/1fpd/1t/debian-10.4-x86_64-20200603.cgz/fsyncBeforeClose/lkp-icl-2sp6/40M/fsmark/0xd000331
commit:
6cccbfebc0 ("block: add a bdev_nonrot helper")
70bed0d544 ("block: add a bdev_write_cache helper")
6cccbfebc02395ae 70bed0d5447e08702c7595d26c8
---------------- ---------------------------
%stddev %change %stddev
\ | \
19.10 +92.0% 36.67 fsmark.files_per_sec
536.13 -47.9% 279.40 fsmark.time.elapsed_time
536.13 -47.9% 279.40 fsmark.time.elapsed_time.max
53273 +2.7% 54708 fsmark.time.voluntary_context_switches
1.49 -2.1% 1.46 iostat.cpu.iowait
908369 ± 17% -39.4% 550808 ± 28% numa-numastat.node1.numa_hit
6.694e+10 -48.0% 3.482e+10 cpuidle..time
1.385e+08 -47.7% 72505602 cpuidle..usage
0.03 +0.0 0.04 ± 3% mpstat.cpu.all.sys%
0.01 ± 3% +0.0 0.01 ± 4% mpstat.cpu.all.usr%
577.53 -44.4% 321.22 uptime.boot
70827 -44.7% 39155 uptime.idle
2334 +102.7% 4732 vmstat.io.bo
3380 +45.5% 4919 vmstat.system.cs
1.38e+08 -47.7% 72098570 turbostat.IRQ
22732 ± 12% -38.8% 13910 ± 6% turbostat.POLL
51.67 -3.9% 49.67 ± 2% turbostat.PkgTmp
134519 +15.4% 155275 meminfo.Active
10873 ± 3% -32.7% 7312 meminfo.Active(anon)
123645 +19.7% 147962 meminfo.Active(file)
29545 -12.3% 25909 meminfo.Shmem
256478 -36.6% 162537 ± 39% numa-meminfo.node0.AnonHugePages
7918 ± 30% -55.5% 3522 ± 9% numa-meminfo.node1.Active(anon)
20189 ± 46% +484.8% 118058 ± 66% numa-meminfo.node1.AnonPages
55896 ± 34% +176.2% 154400 ± 47% numa-meminfo.node1.AnonPages.max
25261 ± 31% +383.3% 122094 ± 64% numa-meminfo.node1.Inactive(anon)
1467 ± 16% +26.8% 1860 ± 11% numa-meminfo.node1.PageTables
12916 ± 22% -45.2% 7081 ± 55% numa-meminfo.node1.Shmem
1978 ± 30% -55.5% 880.00 ± 9% numa-vmstat.node1.nr_active_anon
5049 ± 46% +484.5% 29514 ± 66% numa-vmstat.node1.nr_anon_pages
6319 ± 31% +383.1% 30528 ± 64% numa-vmstat.node1.nr_inactive_anon
366.00 ± 17% +26.8% 464.17 ± 10% numa-vmstat.node1.nr_page_table_pages
3231 ± 22% -45.1% 1773 ± 55% numa-vmstat.node1.nr_shmem
1978 ± 30% -55.5% 880.00 ± 9% numa-vmstat.node1.nr_zone_active_anon
6319 ± 31% +383.1% 30528 ± 64% numa-vmstat.node1.nr_zone_inactive_anon
907485 ± 17% -39.2% 551338 ± 28% numa-vmstat.node1.numa_hit
3311 +42.4% 4714 perf-stat.i.context-switches
133.20 +1.8% 135.58 perf-stat.i.cpu-migrations
2.952e+08 +4.3% 3.078e+08 perf-stat.i.dTLB-loads
1.587e+08 +4.3% 1.655e+08 perf-stat.i.dTLB-stores
2945 +4.7% 3084 perf-stat.i.minor-faults
94.72 -1.8 92.97 perf-stat.i.node-load-miss-rate%
6976 ± 19% +65.2% 11527 ± 14% perf-stat.i.node-loads
56884 ± 12% +51.6% 86264 ± 6% perf-stat.i.node-stores
2946 +4.7% 3085 perf-stat.i.page-faults
92.90 -2.4 90.53 perf-stat.overall.node-load-miss-rate%
3305 +42.1% 4697 perf-stat.ps.context-switches
2.946e+08 +4.1% 3.067e+08 perf-stat.ps.dTLB-loads
1.584e+08 +4.1% 1.649e+08 perf-stat.ps.dTLB-stores
2939 +4.5% 3072 perf-stat.ps.minor-faults
6962 ± 19% +64.9% 11483 ± 14% perf-stat.ps.node-loads
56769 ± 12% +51.4% 85938 ± 6% perf-stat.ps.node-stores
2940 +4.5% 3073 perf-stat.ps.page-faults
5.8e+11 ± 3% -46.4% 3.106e+11 ± 4% perf-stat.total.instructions
2718 ± 3% -32.8% 1826 proc-vmstat.nr_active_anon
30918 +19.5% 36954 proc-vmstat.nr_active_file
82517 +2.3% 84385 proc-vmstat.nr_anon_pages
170379 +5.1% 179015 proc-vmstat.nr_dirtied
160.83 +32.7% 213.50 proc-vmstat.nr_dirty
87111 +2.3% 89076 proc-vmstat.nr_inactive_anon
9165 +1.9% 9340 proc-vmstat.nr_mapped
1104 +7.4% 1186 proc-vmstat.nr_page_table_pages
7386 -12.3% 6475 proc-vmstat.nr_shmem
170150 +5.0% 178704 proc-vmstat.nr_written
2718 ± 3% -32.8% 1826 proc-vmstat.nr_zone_active_anon
30918 +19.5% 36954 proc-vmstat.nr_zone_active_file
87111 +2.3% 89076 proc-vmstat.nr_zone_inactive_anon
161.33 +33.5% 215.33 proc-vmstat.nr_zone_write_pending
1722532 -29.5% 1214402 proc-vmstat.numa_hit
1606723 -31.6% 1098636 proc-vmstat.numa_local
1722459 -29.5% 1214419 proc-vmstat.pgalloc_normal
1723177 -42.0% 999350 proc-vmstat.pgfault
1598401 -32.6% 1077857 proc-vmstat.pgfree
1260337 +6.1% 1337822 proc-vmstat.pgpgout
145698 -44.0% 81595 proc-vmstat.pgreuse
34.69 ± 24% +42.8% 49.55 ± 16% sched_debug.cfs_rq:/.load_avg.avg
49.89 ± 7% +55.5% 77.57 ± 4% sched_debug.cfs_rq:/.runnable_avg.avg
633.30 ± 2% +15.3% 730.40 ± 7% sched_debug.cfs_rq:/.runnable_avg.max
116.72 ± 8% +25.6% 146.60 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev
49.75 ± 7% +55.3% 77.28 ± 4% sched_debug.cfs_rq:/.util_avg.avg
632.78 ± 2% +15.1% 728.53 ± 7% sched_debug.cfs_rq:/.util_avg.max
116.60 ± 8% +25.5% 146.31 ± 6% sched_debug.cfs_rq:/.util_avg.stddev
4.18 ± 22% +53.9% 6.44 ± 15% sched_debug.cfs_rq:/.util_est_enqueued.avg
178.60 ± 10% +39.6% 249.40 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.max
22.91 ± 16% +41.2% 32.34 ± 8% sched_debug.cfs_rq:/.util_est_enqueued.stddev
116236 ± 8% +24.6% 144804 ± 7% sched_debug.cpu.avg_idle.stddev
259878 ± 5% -38.2% 160679 sched_debug.cpu.clock.avg
259881 ± 5% -38.2% 160683 sched_debug.cpu.clock.max
259874 ± 5% -38.2% 160675 sched_debug.cpu.clock.min
1.97 ± 7% +14.2% 2.26 ± 9% sched_debug.cpu.clock.stddev
255028 ± 4% -38.2% 157678 sched_debug.cpu.clock_task.avg
255665 ± 5% -38.2% 158126 sched_debug.cpu.clock_task.max
249556 ± 5% -39.2% 151775 sched_debug.cpu.clock_task.min
11619 ± 3% -22.5% 9002 sched_debug.cpu.curr->pid.max
1173 ± 4% -9.7% 1059 ± 3% sched_debug.cpu.curr->pid.stddev
0.03 ± 7% +26.7% 0.03 ± 5% sched_debug.cpu.nr_running.avg
0.15 ± 2% +10.2% 0.16 ± 2% sched_debug.cpu.nr_running.stddev
8223 ± 4% -15.8% 6924 sched_debug.cpu.nr_switches.avg
1411 ± 9% -22.7% 1090 ± 13% sched_debug.cpu.nr_switches.min
259875 ± 5% -38.2% 160676 sched_debug.cpu_clk
259153 ± 5% -38.3% 159957 sched_debug.ktime
261040 ± 5% -38.2% 161334 sched_debug.sched_clk
53.97 ± 6% -6.4 47.54 ± 2% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
54.36 ± 6% -6.4 47.99 ± 2% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
84.36 -2.3 82.02 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
92.94 -2.0 90.93 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
85.53 -2.0 83.56 perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
0.92 ± 11% +0.1 1.07 ± 4% perf-profile.calltrace.cycles-pp.rcu_idle_exit.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.29 ±100% +0.4 0.74 ± 10% perf-profile.calltrace.cycles-pp.rcu_core.__softirqentry_text_start.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.29 ±101% +0.5 0.81 ± 11% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
0.10 ±223% +0.5 0.64 ± 10% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
3.04 ± 8% +0.5 3.58 ± 10% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
1.18 ± 7% +0.9 2.05 ± 18% perf-profile.calltrace.cycles-pp.ret_from_fork
1.18 ± 7% +0.9 2.05 ± 18% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
28.17 ± 7% +3.8 31.99 ± 2% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
54.25 ± 6% -6.4 47.90 ± 2% perf-profile.children.cycles-pp.mwait_idle_with_hints
54.65 ± 6% -6.3 48.32 ± 2% perf-profile.children.cycles-pp.intel_idle
86.06 -2.0 84.03 perf-profile.children.cycles-pp.cpuidle_enter_state
86.29 -2.0 84.28 perf-profile.children.cycles-pp.cpuidle_enter
93.82 -2.0 91.85 perf-profile.children.cycles-pp.cpuidle_idle_call
0.07 ± 21% +0.0 0.11 ± 12% perf-profile.children.cycles-pp.can_stop_idle_tick
0.05 ± 50% +0.0 0.09 ± 26% perf-profile.children.cycles-pp.mmap_region
0.04 ± 47% +0.0 0.09 ± 22% perf-profile.children.cycles-pp.call_transmit
0.04 ± 47% +0.0 0.09 ± 22% perf-profile.children.cycles-pp.xprt_transmit
0.06 ± 11% +0.0 0.11 ± 24% perf-profile.children.cycles-pp.process_backlog
0.06 ± 17% +0.0 0.11 ± 20% perf-profile.children.cycles-pp.__local_bh_enable_ip
0.04 ± 72% +0.0 0.09 ± 29% perf-profile.children.cycles-pp.handle_irq_event
0.04 ± 72% +0.0 0.09 ± 29% perf-profile.children.cycles-pp.__handle_irq_event_percpu
0.05 ± 45% +0.1 0.10 ± 20% perf-profile.children.cycles-pp.ip6_protocol_deliver_rcu
0.05 ± 45% +0.1 0.10 ± 20% perf-profile.children.cycles-pp.tcp_v6_rcv
0.04 ± 74% +0.1 0.10 ± 27% perf-profile.children.cycles-pp.rpc_async_schedule
0.07 ± 23% +0.1 0.12 ± 21% perf-profile.children.cycles-pp.ip6_finish_output2
0.04 ± 72% +0.1 0.09 ± 30% perf-profile.children.cycles-pp.__common_interrupt
0.05 ± 45% +0.1 0.10 ± 20% perf-profile.children.cycles-pp.ip6_input_finish
0.05 ± 46% +0.1 0.10 ± 19% perf-profile.children.cycles-pp.__netif_receive_skb_one_core
0.06 ± 13% +0.1 0.11 ± 21% perf-profile.children.cycles-pp.__napi_poll
0.31 ± 10% +0.1 0.37 ± 4% perf-profile.children.cycles-pp.error_entry
0.07 ± 23% +0.1 0.13 ± 18% perf-profile.children.cycles-pp.ip6_xmit
0.04 ± 73% +0.1 0.09 ± 23% perf-profile.children.cycles-pp.xs_tcp_send_request
0.06 ± 13% +0.1 0.12 ± 18% perf-profile.children.cycles-pp.net_rx_action
0.04 ± 73% +0.1 0.09 ± 22% perf-profile.children.cycles-pp.xprt_request_transmit
0.04 ± 71% +0.1 0.09 ± 23% perf-profile.children.cycles-pp.tcp_v6_do_rcv
0.04 ± 71% +0.1 0.09 ± 23% perf-profile.children.cycles-pp.tcp_rcv_established
0.02 ±145% +0.1 0.08 ± 26% perf-profile.children.cycles-pp.inode_permission
0.07 ± 23% +0.1 0.13 ± 17% perf-profile.children.cycles-pp.inet6_csk_xmit
0.08 ± 17% +0.1 0.14 ± 14% perf-profile.children.cycles-pp.__tcp_transmit_skb
0.05 ± 48% +0.1 0.11 ± 20% perf-profile.children.cycles-pp.rpc_run_task
0.04 ± 71% +0.1 0.10 ± 23% perf-profile.children.cycles-pp.queue_work_on
0.05 ± 46% +0.1 0.11 ± 20% perf-profile.children.cycles-pp.rpc_execute
0.08 ± 23% +0.1 0.15 ± 23% perf-profile.children.cycles-pp.svc_recv
0.08 ± 25% +0.1 0.15 ± 38% perf-profile.children.cycles-pp.do_softirq
0.07 ± 9% +0.1 0.14 ± 16% perf-profile.children.cycles-pp.__tcp_push_pending_frames
0.07 ± 11% +0.1 0.14 ± 16% perf-profile.children.cycles-pp.tcp_write_xmit
0.10 ± 23% +0.1 0.18 ± 12% perf-profile.children.cycles-pp.__rpc_execute
0.08 ± 14% +0.1 0.15 ± 14% perf-profile.children.cycles-pp.__queue_work
0.07 ± 10% +0.1 0.15 ± 15% perf-profile.children.cycles-pp.tcp_sock_set_cork
0.15 ± 16% +0.1 0.24 ± 14% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.13 ± 27% +0.1 0.23 ± 24% perf-profile.children.cycles-pp.open
0.22 ± 12% +0.1 0.32 ± 11% perf-profile.children.cycles-pp.try_to_wake_up
0.18 ± 18% +0.1 0.30 ± 19% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.03 ±100% +0.1 0.16 ± 45% perf-profile.children.cycles-pp.btree_csum_one_bio
0.03 ±100% +0.1 0.16 ± 45% perf-profile.children.cycles-pp.csum_one_extent_buffer
0.29 ± 17% +0.2 0.44 ± 10% perf-profile.children.cycles-pp.unwind_next_frame
0.32 ± 27% +0.2 0.48 ± 14% perf-profile.children.cycles-pp.io_serial_in
0.40 ± 17% +0.2 0.59 ± 10% perf-profile.children.cycles-pp.get_perf_callchain
0.40 ± 17% +0.2 0.59 ± 10% perf-profile.children.cycles-pp.perf_callchain
0.34 ± 16% +0.2 0.53 ± 10% perf-profile.children.cycles-pp.perf_callchain_kernel
0.45 ± 18% +0.2 0.64 ± 10% perf-profile.children.cycles-pp.process_one_work
0.43 ± 16% +0.2 0.62 ± 10% perf-profile.children.cycles-pp.perf_prepare_sample
0.36 ± 19% +0.2 0.58 ± 12% perf-profile.children.cycles-pp.note_gp_changes
0.48 ± 16% +0.2 0.71 ± 11% perf-profile.children.cycles-pp.perf_event_output_forward
0.55 ± 12% +0.2 0.79 ± 8% perf-profile.children.cycles-pp.rcu_core
0.48 ± 15% +0.2 0.72 ± 11% perf-profile.children.cycles-pp.__perf_event_overflow
0.50 ± 15% +0.2 0.75 ± 11% perf-profile.children.cycles-pp.perf_tp_event
0.52 ± 14% +0.3 0.81 ± 11% perf-profile.children.cycles-pp.worker_thread
0.99 ± 13% +0.3 1.28 ± 5% perf-profile.children.cycles-pp.irqtime_account_irq
1.54 ± 12% +0.4 1.90 ± 4% perf-profile.children.cycles-pp.sched_clock_cpu
3.20 ± 8% +0.7 3.87 ± 9% perf-profile.children.cycles-pp.__softirqentry_text_start
3.88 ± 9% +0.8 4.64 ± 10% perf-profile.children.cycles-pp.__irq_exit_rcu
1.18 ± 7% +0.9 2.05 ± 18% perf-profile.children.cycles-pp.kthread
1.19 ± 7% +0.9 2.07 ± 18% perf-profile.children.cycles-pp.ret_from_fork
25.11 ± 8% +3.4 28.51 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
54.23 ± 6% -6.4 47.83 ± 2% perf-profile.self.cycles-pp.mwait_idle_with_hints
0.22 ± 11% +0.1 0.30 ± 17% perf-profile.self.cycles-pp.sched_clock_cpu
0.32 ± 27% +0.2 0.48 ± 14% perf-profile.self.cycles-pp.io_serial_in
1.19 ± 12% +0.2 1.44 ± 4% perf-profile.self.cycles-pp.native_sched_clock
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Attachments:
- config-5.18.0-rc1-00362-g70bed0d5447e (text/plain, 163540 bytes)
- job-script (text/plain, 8500 bytes)
- job.yaml (text/plain, 5927 bytes)
- reproduce (text/plain, 875 bytes)