Message-ID: <20190102004002.GB17624@shao2-debian>
Date: Wed, 2 Jan 2019 08:40:02 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Theodore Ts'o <tytso@....edu>
Cc: LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [ext4] fde872682e: fsmark.files_per_sec -38.0% regression
Greetings,
FYI, we noticed a -38.0% regression of fsmark.files_per_sec due to commit:
commit: fde872682e175743e0c3ef939c89e3c6008a1529 ("ext4: force inode writes when nfsd calls commit_metadata()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
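As a rough sketch of the code path involved, assembled from the commit title and from the new perf-profile symbols further down in this report (ext4_nfs_commit_metadata, ext4_write_inode, jbd2_journal_force_commit, jbd2_log_wait_commit), and therefore an approximation rather than the verbatim upstream patch: nfsd's commit_metadata() now finds a ->commit_metadata export op on ext4 and forces a synchronous inode write, i.e. a jbd2 journal commit, instead of falling back to the cheaper sync_inode_metadata() default.

/* Sketch only, inferred from the commit title and the perf-profile data below. */

/* fs/nfsd/vfs.c: nfsd asks the filesystem to make file metadata stable */
int commit_metadata(struct svc_fh *fhp)
{
	struct inode *inode = d_inode(fhp->fh_dentry);
	const struct export_operations *export_ops = inode->i_sb->s_export_op;

	if (!EX_ISSYNC(fhp->fh_export))
		return 0;

	if (export_ops->commit_metadata)
		return export_ops->commit_metadata(inode);
	return sync_inode_metadata(inode, 1);
}

/* fs/ext4/super.c: export op added by fde872682e (approximate shape) */
static int ext4_nfs_commit_metadata(struct inode *inode)
{
	struct writeback_control wbc = {
		.sync_mode = WB_SYNC_ALL,	/* synchronous: wait for the write */
	};

	/* with WB_SYNC_ALL, ext4_write_inode() forces a jbd2 journal commit */
	return ext4_write_inode(inode, &wbc);
}

static const struct export_operations ext4_export_ops = {
	/* .fh_to_dentry, .fh_to_parent, .get_parent unchanged */
	.commit_metadata = ext4_nfs_commit_metadata,
};

With fsyncBeforeClose over nfsv4 and 9-byte files, the workload is almost entirely metadata, so waiting on a journal commit per commit_metadata() call would be consistent with the +59.4% elapsed time and with the jbd2_log_wait_commit / jbd2_journal_force_commit samples that appear only in the bad kernel.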
in testcase: fsmark
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with the following parameters:
iterations: 4
disk: 1SSD
nr_threads: 4
fs: ext4
fs2: nfsv4
filesize: 9B
test_size: 10G
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
ucode: 0x3d
test-description: fsmark is a file system benchmark that tests synchronous write workloads, such as a mail server workload.
test-url: https://sourceforge.net/projects/fsmark/
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
gcc-7/performance/1SSD/9B/nfsv4/ext4/4/x86_64-rhel-7.2/16d/256fpd/4/debian-x86_64-2018-04-03.cgz/fsyncBeforeClose/lkp-hsw-ep4/10G/fsmark/0x3d
commit:
8a363970d1 ("ext4: avoid declaring fs inconsistent due to invalid file handles")
fde872682e ("ext4: force inode writes when nfsd calls commit_metadata()")
8a363970d1dc38c4 fde872682e175743e0c3ef939c
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#]
1:4 -15% 1:4 perf-profile.children.cycles-pp.schedule_timeout
%stddev %change %stddev
\ | \
47200062 -2.6% 45958374 fsmark.app_overhead
6204 -38.0% 3843 fsmark.files_per_sec
434.27 +59.4% 692.36 fsmark.time.elapsed_time
434.27 +59.4% 692.36 fsmark.time.elapsed_time.max
57.75 -26.8% 42.25 fsmark.time.percent_of_cpu_this_job_got
232.03 +15.2% 267.32 fsmark.time.system_time
21.20 ± 2% +23.9% 26.27 ± 2% fsmark.time.user_time
11204897 -4.8% 10664974 fsmark.time.voluntary_context_switches
289016 +50.5% 435058 interrupts.CAL:Function_call_interrupts
9808911 ± 3% +57.4% 15436821 ± 7% softirqs.RCU
4336991 +56.8% 6800859 softirqs.SCHED
12876475 ± 9% +56.1% 20097329 ± 10% softirqs.TIMER
104137 -17.0% 86414 ± 2% meminfo.Dirty
590.75 ± 4% +168.0% 1583 ± 24% meminfo.Mlocked
1963 ± 3% -53.0% 922.75 ± 12% meminfo.Writeback
75675 -37.4% 47358 meminfo.max_used_kB
0.72 ± 7% -0.3 0.43 ± 8% mpstat.cpu.iowait%
0.35 ± 2% -0.1 0.25 ± 5% mpstat.cpu.soft%
3.05 -1.0 2.01 ± 5% mpstat.cpu.sys%
0.07 ± 3% -0.0 0.06 ± 3% mpstat.cpu.usr%
271792 -33.4% 181130 vmstat.io.bo
2.00 -50.0% 1.00 vmstat.procs.r
397858 -35.3% 257252 vmstat.system.cs
169073 -4.3% 161744 vmstat.system.in
19.00 -15.8% 16.00 nfsstat.Client.nfs.v3.access.percent
8.00 +59.4% 12.75 ± 3% nfsstat.Client.nfs.v3.commit
15.25 ± 5% +29.5% 19.75 ± 2% nfsstat.Client.nfs.v3.commit.percent
21.50 ± 2% -16.3% 18.00 nfsstat.Client.nfs.v3.lookup.percent
8.00 +59.4% 12.75 ± 3% nfsstat.Client.nfs.v3.write
15.25 ± 5% +32.8% 20.25 ± 2% nfsstat.Client.nfs.v3.write.percent
18.25 ± 2% +57.5% 28.75 ± 2% nfsstat.Client.nfs.v4.getattr
320.50 ± 13% +161.5% 838.25 ± 22% numa-meminfo.node0.Mlocked
995.75 ± 6% -58.1% 416.75 ± 10% numa-meminfo.node0.Writeback
49384 ± 10% -29.3% 34920 ± 13% numa-meminfo.node1.Dirty
10081 +24.4% 12542 ± 18% numa-meminfo.node1.Mapped
270.25 ± 14% +175.5% 744.50 ± 35% numa-meminfo.node1.Mlocked
5495 ± 40% +171.8% 14937 ± 43% numa-meminfo.node1.Shmem
1003 ± 4% -46.6% 535.75 ± 23% numa-meminfo.node1.Writeback
79.75 ± 13% +162.1% 209.00 ± 22% numa-vmstat.node0.nr_mlock
244.25 ± 4% -57.9% 102.75 ± 7% numa-vmstat.node0.nr_writeback
12344 ± 10% -29.3% 8728 ± 13% numa-vmstat.node1.nr_dirty
67.25 ± 14% +176.2% 185.75 ± 35% numa-vmstat.node1.nr_mlock
1373 ± 40% +171.9% 3734 ± 43% numa-vmstat.node1.nr_shmem
261.75 ± 3% -52.7% 123.75 ± 15% numa-vmstat.node1.nr_writeback
12550 ± 9% -29.5% 8847 ± 12% numa-vmstat.node1.nr_zone_write_pending
41916654 -25.0% 31433195 cpuidle.C1.usage
2.552e+09 ± 2% +53.6% 3.92e+09 ± 24% cpuidle.C1E.time
41366264 +28.4% 53131638 ± 18% cpuidle.C1E.usage
43473347 ± 16% +51.2% 65723309 ± 9% cpuidle.C3.usage
9.254e+09 ± 34% +150.2% 2.315e+10 ± 17% cpuidle.C6.time
19046402 ± 41% +92.9% 36731726 ± 10% cpuidle.C6.usage
44666028 -28.2% 32062945 cpuidle.POLL.time
11872527 -27.3% 8631232 cpuidle.POLL.usage
174.50 -24.8% 131.25 ± 2% turbostat.Avg_MHz
6.22 -1.4 4.80 ± 2% turbostat.Busy%
41909736 -25.0% 31427203 turbostat.C1
5.11 ± 2% -2.0 3.07 ± 2% turbostat.C1%
41365164 +28.4% 53130249 ± 18% turbostat.C1E
43472400 ± 16% +51.2% 65722400 ± 9% turbostat.C3
19040126 ± 41% +92.9% 36726440 ± 10% turbostat.C6
29.32 ± 33% +16.8 46.16 ± 17% turbostat.C6%
8.15 ± 40% +132.1% 18.92 ± 35% turbostat.CPU%c6
73871564 +52.7% 1.128e+08 turbostat.IRQ
1.08 ± 19% +76.0% 1.91 ± 38% turbostat.Pkg%pc2
165.05 -8.2% 151.53 ± 3% turbostat.PkgWatt
126.50 +5.1% 133.00 proc-vmstat.nr_anon_transparent_hugepages
5722040 +2.9% 5885724 proc-vmstat.nr_dirtied
25925 -16.6% 21627 ± 2% proc-vmstat.nr_dirty
147.50 ± 4% +168.1% 395.50 ± 24% proc-vmstat.nr_mlock
464.75 ± 3% -50.7% 229.00 ± 13% proc-vmstat.nr_writeback
5698733 +2.8% 5856159 proc-vmstat.nr_written
26310 -16.9% 21857 ± 2% proc-vmstat.nr_zone_write_pending
1434 ± 7% -18.1% 1175 ± 13% proc-vmstat.numa_hint_faults
20765868 +2.7% 21325775 proc-vmstat.numa_hit
20751707 +2.7% 21311589 proc-vmstat.numa_local
41882 ± 6% -35.5% 27003 ± 11% proc-vmstat.numa_pages_migrated
30334205 +1.7% 30851916 proc-vmstat.pgalloc_normal
1246899 +57.3% 1961018 proc-vmstat.pgfault
25996781 +2.5% 26652253 proc-vmstat.pgfree
41882 ± 6% -35.5% 27003 ± 11% proc-vmstat.pgmigrate_success
1.189e+08 +6.3% 1.264e+08 proc-vmstat.pgpgout
14068 +21.1% 17034 ± 2% slabinfo.dquot.active_objs
14736 +17.7% 17351 ± 2% slabinfo.dquot.num_objs
23137 -21.9% 18060 ± 2% slabinfo.jbd2_journal_head.active_objs
698.25 -20.9% 552.50 ± 2% slabinfo.jbd2_journal_head.active_slabs
23758 -20.8% 18805 ± 2% slabinfo.jbd2_journal_head.num_objs
698.25 -20.9% 552.50 ± 2% slabinfo.jbd2_journal_head.num_slabs
14804 ± 8% -28.9% 10518 ± 4% slabinfo.kmalloc-192.active_objs
352.00 ± 8% -28.9% 250.25 ± 4% slabinfo.kmalloc-192.active_slabs
14805 ± 8% -29.0% 10519 ± 4% slabinfo.kmalloc-192.num_objs
352.00 ± 8% -28.9% 250.25 ± 4% slabinfo.kmalloc-192.num_slabs
368.50 ± 9% +34.9% 497.25 ± 3% slabinfo.nfs_commit_data.active_objs
368.50 ± 9% +34.9% 497.25 ± 3% slabinfo.nfs_commit_data.num_objs
23380 ± 2% -57.3% 9981 ± 5% slabinfo.numa_policy.active_objs
379.75 ± 2% -57.7% 160.50 ± 5% slabinfo.numa_policy.active_slabs
23571 ± 2% -57.7% 9981 ± 5% slabinfo.numa_policy.num_objs
379.75 ± 2% -57.7% 160.50 ± 5% slabinfo.numa_policy.num_slabs
3835 ± 5% +9.4% 4196 ± 4% slabinfo.pool_workqueue.active_objs
3944 ± 4% +9.0% 4299 ± 4% slabinfo.pool_workqueue.num_objs
2.12 ± 57% +0.9 2.99 ± 3% perf-stat.i.branch-miss-rate%
2.02 ± 58% +45.3% 2.93 ± 3% perf-stat.i.cpi
0.22 ± 62% +0.2 0.39 ± 8% perf-stat.i.dTLB-load-miss-rate%
14.84 ± 59% +10.4 25.26 ± 14% perf-stat.i.iTLB-load-miss-rate%
2.05 ± 57% +0.9 2.93 ± 3% perf-stat.overall.branch-miss-rate%
1.92 ± 57% +48.8% 2.86 ± 4% perf-stat.overall.cpi
0.21 ± 60% +0.2 0.38 ± 8% perf-stat.overall.dTLB-load-miss-rate%
0.06 ± 59% +0.0 0.10 ± 13% perf-stat.overall.dTLB-store-miss-rate%
13.94 ± 58% +10.7 24.66 ± 14% perf-stat.overall.iTLB-load-miss-rate%
3.364e+11 ± 60% +64.1% 5.52e+11 ± 2% perf-stat.total.branch-instructions
9.228e+09 ± 60% +75.6% 1.62e+10 ± 5% perf-stat.total.branch-misses
7.87e+10 ± 60% +71.3% 1.348e+11 ± 4% perf-stat.total.cache-references
1.169e+08 ± 60% +53.5% 1.795e+08 perf-stat.total.context-switches
4.006e+12 ± 60% +83.2% 7.341e+12 ± 5% perf-stat.total.cpu-cycles
44664 ± 61% +107.5% 92668 ± 10% perf-stat.total.cpu-migrations
1.301e+09 ± 66% +118.4% 2.842e+09 ± 10% perf-stat.total.dTLB-load-misses
4.504e+11 ± 60% +63.9% 7.382e+11 ± 3% perf-stat.total.dTLB-loads
2.802e+08 ± 65% +98.5% 5.563e+08 ± 16% perf-stat.total.dTLB-store-misses
3.237e+11 ± 61% +78.0% 5.761e+11 ± 6% perf-stat.total.dTLB-stores
3.355e+08 ± 62% +135.3% 7.896e+08 ± 16% perf-stat.total.iTLB-load-misses
1.461e+09 ± 60% +64.6% 2.406e+09 ± 4% perf-stat.total.iTLB-loads
1.56e+12 ± 60% +64.7% 2.568e+12 ± 2% perf-stat.total.instructions
814263 ± 61% +135.0% 1913871 perf-stat.total.minor-faults
1.659e+08 ± 61% +91.5% 3.178e+08 ± 4% perf-stat.total.node-loads
814279 ± 61% +135.0% 1913880 perf-stat.total.page-faults
8306 ± 6% +13.8% 9451 sched_debug.cfs_rq:/.exec_clock.avg
11550 ± 7% +28.5% 14846 ± 6% sched_debug.cfs_rq:/.exec_clock.max
1273 ± 9% +71.8% 2188 ± 17% sched_debug.cfs_rq:/.exec_clock.stddev
300.14 ± 4% -10.4% 268.93 ± 6% sched_debug.cfs_rq:/.load_avg.avg
0.09 ± 6% -21.6% 0.07 ± 5% sched_debug.cfs_rq:/.nr_running.avg
0.28 ± 2% -9.3% 0.25 ± 2% sched_debug.cfs_rq:/.nr_running.stddev
0.13 ± 20% +143.1% 0.32 ± 7% sched_debug.cfs_rq:/.nr_spread_over.avg
0.39 ± 16% +62.1% 0.62 ± 17% sched_debug.cfs_rq:/.nr_spread_over.stddev
106.44 ± 4% -25.6% 79.20 ± 7% sched_debug.cfs_rq:/.util_avg.avg
141.31 -14.1% 121.32 ± 5% sched_debug.cfs_rq:/.util_avg.stddev
18.40 ± 6% -51.7% 8.90 ± 9% sched_debug.cfs_rq:/.util_est_enqueued.avg
517.95 ± 15% -43.6% 292.23 ± 6% sched_debug.cfs_rq:/.util_est_enqueued.max
80.58 ± 11% -45.7% 43.76 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.stddev
242176 ± 6% +53.2% 371011 sched_debug.cpu.clock.avg
242180 ± 6% +53.2% 371014 sched_debug.cpu.clock.max
242171 ± 6% +53.2% 371007 sched_debug.cpu.clock.min
242176 ± 6% +53.2% 371011 sched_debug.cpu.clock_task.avg
242180 ± 6% +53.2% 371014 sched_debug.cpu.clock_task.max
242171 ± 6% +53.2% 371007 sched_debug.cpu.clock_task.min
35.76 ± 15% -27.0% 26.12 ± 13% sched_debug.cpu.cpu_load[1].avg
34.55 ± 11% -30.0% 24.20 ± 6% sched_debug.cpu.cpu_load[2].avg
1246 ± 8% -29.4% 879.79 ± 12% sched_debug.cpu.cpu_load[2].max
161.68 ± 9% -28.6% 115.50 ± 10% sched_debug.cpu.cpu_load[2].stddev
33.24 ± 8% -35.2% 21.54 ± 5% sched_debug.cpu.cpu_load[3].avg
1104 ± 6% -36.5% 701.37 ± 8% sched_debug.cpu.cpu_load[3].max
144.33 ± 5% -34.5% 94.55 ± 7% sched_debug.cpu.cpu_load[3].stddev
32.05 ± 6% -40.9% 18.94 ± 3% sched_debug.cpu.cpu_load[4].avg
1036 ± 4% -42.6% 595.23 ± 9% sched_debug.cpu.cpu_load[4].max
135.30 ± 4% -39.8% 81.41 ± 5% sched_debug.cpu.cpu_load[4].stddev
7345 ± 5% +51.0% 11090 sched_debug.cpu.curr->pid.max
1143 ± 5% +35.1% 1544 ± 6% sched_debug.cpu.curr->pid.stddev
200600 ± 7% +67.2% 335333 sched_debug.cpu.nr_load_updates.avg
212000 ± 6% +63.6% 346878 sched_debug.cpu.nr_load_updates.max
198375 ± 7% +67.7% 332700 sched_debug.cpu.nr_load_updates.min
0.08 ± 10% -21.6% 0.07 ± 5% sched_debug.cpu.nr_running.avg
0.27 ± 3% -9.4% 0.24 ± 2% sched_debug.cpu.nr_running.stddev
1583224 ± 7% +22.4% 1937784 ± 8% sched_debug.cpu.nr_switches.max
177319 ± 9% +67.1% 296352 ± 16% sched_debug.cpu.nr_switches.stddev
41.99 ± 5% -16.3% 35.17 ± 9% sched_debug.cpu.nr_uninterruptible.max
19.03 ± 5% -18.0% 15.62 ± 8% sched_debug.cpu.nr_uninterruptible.stddev
1585283 ± 7% +22.1% 1935939 ± 8% sched_debug.cpu.sched_count.max
177587 ± 9% +66.6% 295878 ± 16% sched_debug.cpu.sched_count.stddev
764456 ± 7% +22.7% 937804 ± 8% sched_debug.cpu.sched_goidle.max
85178 ± 9% +67.9% 143017 ± 17% sched_debug.cpu.sched_goidle.stddev
888381 ± 8% +23.8% 1100076 ± 3% sched_debug.cpu.ttwu_count.max
120728 ± 7% +49.0% 179858 ± 5% sched_debug.cpu.ttwu_count.stddev
44554 ± 9% +19.3% 53138 ± 3% sched_debug.cpu.ttwu_local.stddev
242172 ± 6% +53.2% 371007 sched_debug.cpu_clk
238510 ± 6% +54.0% 367345 sched_debug.ktime
0.00 ± 39% +214.7% 0.00 ± 28% sched_debug.rt_rq:/.rt_time.avg
0.01 ± 39% +214.7% 0.02 ± 28% sched_debug.rt_rq:/.rt_time.max
0.00 ± 39% +214.7% 0.00 ± 28% sched_debug.rt_rq:/.rt_time.stddev
243314 ± 6% +53.0% 372174 sched_debug.sched_clk
19.37 ± 10% -4.4 14.93 ± 14% perf-profile.calltrace.cycles-pp.ret_from_fork
19.37 ± 10% -4.4 14.93 ± 14% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
11.27 ± 15% -3.6 7.63 ± 18% perf-profile.calltrace.cycles-pp.nfsd.kthread.ret_from_fork
5.95 ± 15% -2.0 3.98 ± 14% perf-profile.calltrace.cycles-pp.svc_process.nfsd.kthread.ret_from_fork
5.94 ± 16% -2.0 3.98 ± 14% perf-profile.calltrace.cycles-pp.svc_process_common.svc_process.nfsd.kthread.ret_from_fork
5.52 ± 15% -1.8 3.69 ± 14% perf-profile.calltrace.cycles-pp.nfsd_dispatch.svc_process_common.svc_process.nfsd.kthread
5.12 ± 16% -1.7 3.43 ± 13% perf-profile.calltrace.cycles-pp.nfsd4_proc_compound.nfsd_dispatch.svc_process_common.svc_process.nfsd
2.67 ± 16% -1.0 1.70 ± 23% perf-profile.calltrace.cycles-pp.svc_tcp_sendto.svc_send.nfsd.kthread.ret_from_fork
2.67 ± 16% -1.0 1.70 ± 23% perf-profile.calltrace.cycles-pp.svc_sendto.svc_tcp_sendto.svc_send.nfsd.kthread
2.65 ± 16% -1.0 1.68 ± 23% perf-profile.calltrace.cycles-pp.svc_send_common.svc_sendto.svc_tcp_sendto.svc_send.nfsd
2.64 ± 16% -1.0 1.67 ± 23% perf-profile.calltrace.cycles-pp.kernel_sendpage.svc_send_common.svc_sendto.svc_tcp_sendto.svc_send
2.63 ± 16% -1.0 1.67 ± 23% perf-profile.calltrace.cycles-pp.inet_sendpage.kernel_sendpage.svc_send_common.svc_sendto.svc_tcp_sendto
2.60 ± 15% -0.9 1.66 ± 23% perf-profile.calltrace.cycles-pp.tcp_sendpage.inet_sendpage.kernel_sendpage.svc_send_common.svc_sendto
2.01 ± 13% -0.8 1.16 ± 22% perf-profile.calltrace.cycles-pp.nfsd4_write.nfsd4_proc_compound.nfsd_dispatch.svc_process_common.svc_process
1.94 ± 14% -0.8 1.13 ± 21% perf-profile.calltrace.cycles-pp.nfsd_vfs_write.nfsd4_write.nfsd4_proc_compound.nfsd_dispatch.svc_process_common
1.90 ± 13% -0.8 1.10 ± 21% perf-profile.calltrace.cycles-pp.do_iter_readv_writev.do_iter_write.nfsd_vfs_write.nfsd4_write.nfsd4_proc_compound
1.91 ± 13% -0.8 1.12 ± 21% perf-profile.calltrace.cycles-pp.do_iter_write.nfsd_vfs_write.nfsd4_write.nfsd4_proc_compound.nfsd_dispatch
1.89 ± 13% -0.8 1.10 ± 21% perf-profile.calltrace.cycles-pp.ext4_file_write_iter.do_iter_readv_writev.do_iter_write.nfsd_vfs_write.nfsd4_write
2.11 ± 15% -0.7 1.39 ± 25% perf-profile.calltrace.cycles-pp.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage
2.12 ± 15% -0.7 1.41 ± 25% perf-profile.calltrace.cycles-pp.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage.svc_send_common
1.90 ± 16% -0.7 1.24 ± 25% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage
1.90 ± 16% -0.7 1.24 ± 25% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage
1.70 ± 16% -0.6 1.12 ± 26% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked
1.54 ± 16% -0.5 1.03 ± 26% perf-profile.calltrace.cycles-pp.inet6_csk_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages
1.11 ± 14% -0.4 0.69 ± 18% perf-profile.calltrace.cycles-pp.ext4_sync_file.ext4_file_write_iter.do_iter_readv_writev.do_iter_write.nfsd_vfs_write
0.93 ± 16% -0.3 0.59 ± 19% perf-profile.calltrace.cycles-pp.file_write_and_wait_range.ext4_sync_file.ext4_file_write_iter.do_iter_readv_writev.do_iter_write
1.10 ± 17% +0.5 1.56 ± 14% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
1.21 ± 17% +0.5 1.68 ± 13% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
1.36 ± 17% +0.6 1.94 ± 14% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
0.89 ± 19% +0.7 1.64 ± 27% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
1.00 ± 18% +0.8 1.80 ± 27% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.xs_send_kvec
0.99 ± 18% +0.8 1.80 ± 27% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
2.19 ± 10% +0.9 3.06 ± 9% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
3.31 ± 7% +1.3 4.63 ± 10% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle
7.83 ± 4% +1.9 9.71 ± 6% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
6.19 ± 6% +2.0 8.17 ± 9% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry
65.59 ± 2% +4.3 69.86 ± 4% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
73.52 ± 2% +4.6 78.17 ± 3% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
73.52 ± 2% +4.6 78.16 ± 3% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
73.46 ± 2% +4.7 78.11 ± 3% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
19.37 ± 10% -4.4 14.93 ± 14% perf-profile.children.cycles-pp.kthread
19.37 ± 10% -4.4 14.94 ± 14% perf-profile.children.cycles-pp.ret_from_fork
11.27 ± 15% -3.6 7.63 ± 18% perf-profile.children.cycles-pp.nfsd
5.95 ± 15% -2.0 3.98 ± 14% perf-profile.children.cycles-pp.svc_process
5.94 ± 16% -2.0 3.98 ± 14% perf-profile.children.cycles-pp.svc_process_common
5.52 ± 15% -1.8 3.69 ± 14% perf-profile.children.cycles-pp.nfsd_dispatch
5.12 ± 16% -1.7 3.43 ± 13% perf-profile.children.cycles-pp.nfsd4_proc_compound
2.67 ± 16% -1.0 1.70 ± 23% perf-profile.children.cycles-pp.svc_tcp_sendto
2.67 ± 16% -1.0 1.70 ± 23% perf-profile.children.cycles-pp.svc_sendto
2.65 ± 16% -1.0 1.68 ± 23% perf-profile.children.cycles-pp.svc_send_common
2.64 ± 16% -1.0 1.68 ± 23% perf-profile.children.cycles-pp.kernel_sendpage
2.65 ± 16% -1.0 1.69 ± 23% perf-profile.children.cycles-pp.inet_sendpage
2.62 ± 16% -0.9 1.68 ± 23% perf-profile.children.cycles-pp.tcp_sendpage
2.01 ± 13% -0.8 1.16 ± 22% perf-profile.children.cycles-pp.nfsd4_write
1.94 ± 14% -0.8 1.13 ± 21% perf-profile.children.cycles-pp.nfsd_vfs_write
1.21 ± 19% -0.8 0.41 ± 11% perf-profile.children.cycles-pp.poll_idle
1.90 ± 13% -0.8 1.10 ± 21% perf-profile.children.cycles-pp.do_iter_readv_writev
1.91 ± 13% -0.8 1.12 ± 21% perf-profile.children.cycles-pp.do_iter_write
1.89 ± 13% -0.8 1.10 ± 21% perf-profile.children.cycles-pp.ext4_file_write_iter
2.14 ± 15% -0.7 1.42 ± 24% perf-profile.children.cycles-pp.tcp_sendpage_locked
2.12 ± 15% -0.7 1.41 ± 24% perf-profile.children.cycles-pp.do_tcp_sendpages
1.11 ± 14% -0.4 0.69 ± 18% perf-profile.children.cycles-pp.ext4_sync_file
2.71 ± 7% -0.4 2.33 ± 10% perf-profile.children.cycles-pp.__account_scheduler_latency
0.76 ± 13% -0.4 0.40 ± 28% perf-profile.children.cycles-pp.__generic_file_write_iter
0.98 ± 7% -0.3 0.64 ± 20% perf-profile.children.cycles-pp.generic_perform_write
0.93 ± 16% -0.3 0.59 ± 19% perf-profile.children.cycles-pp.file_write_and_wait_range
0.69 ± 9% -0.3 0.35 ± 14% perf-profile.children.cycles-pp.release_sock
2.10 ± 8% -0.3 1.76 ± 10% perf-profile.children.cycles-pp.__save_stack_trace
0.81 ± 16% -0.3 0.51 ± 17% perf-profile.children.cycles-pp.ext4_writepages
0.79 ± 16% -0.3 0.49 ± 23% perf-profile.children.cycles-pp.fh_verify
1.48 ± 8% -0.3 1.19 ± 10% perf-profile.children.cycles-pp.unwind_next_frame
0.41 ± 17% -0.2 0.21 ± 23% perf-profile.children.cycles-pp.ext4_da_write_begin
0.47 ± 17% -0.2 0.27 ± 14% perf-profile.children.cycles-pp.ext4_map_blocks
0.64 ± 20% -0.2 0.44 ± 24% perf-profile.children.cycles-pp.__wake_up_bit
0.36 ± 20% -0.2 0.20 ± 27% perf-profile.children.cycles-pp.ext4_ext_map_blocks
0.30 ± 10% -0.2 0.14 ± 10% perf-profile.children.cycles-pp.__release_sock
0.72 ± 11% -0.2 0.56 ± 16% perf-profile.children.cycles-pp.pick_next_task_fair
0.36 ± 41% -0.2 0.21 ± 38% perf-profile.children.cycles-pp.ext4_end_bio
0.34 ± 38% -0.2 0.19 ± 41% perf-profile.children.cycles-pp.ext4_finish_bio
0.48 ± 19% -0.1 0.33 ± 29% perf-profile.children.cycles-pp.svc_xprt_received
0.28 ± 20% -0.1 0.13 ± 42% perf-profile.children.cycles-pp.jbd2__journal_start
0.41 ± 13% -0.1 0.28 ± 20% perf-profile.children.cycles-pp.ext4_mark_inode_dirty
0.16 ± 6% -0.1 0.03 ±100% perf-profile.children.cycles-pp.__lock_sock
0.37 ± 2% -0.1 0.24 ± 24% perf-profile.children.cycles-pp.lock_sock_nested
0.33 ± 15% -0.1 0.20 ± 26% perf-profile.children.cycles-pp.nfsd_setuser_and_check_port
0.26 ± 19% -0.1 0.13 ± 38% perf-profile.children.cycles-pp.start_this_handle
0.31 ± 16% -0.1 0.18 ± 26% perf-profile.children.cycles-pp.nfsd_setuser
0.24 ± 45% -0.1 0.11 ± 7% perf-profile.children.cycles-pp.__blk_mq_try_issue_directly
0.17 ± 23% -0.1 0.05 ± 67% perf-profile.children.cycles-pp.add_transaction_credits
0.16 ± 23% -0.1 0.04 ± 60% perf-profile.children.cycles-pp.__jbd2_log_wait_for_space
0.22 ± 48% -0.1 0.10 ± 10% perf-profile.children.cycles-pp.nvme_queue_rq
0.21 ± 23% -0.1 0.10 ± 35% perf-profile.children.cycles-pp.ext4_mb_new_blocks
0.39 ± 21% -0.1 0.28 ± 27% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.27 ± 15% -0.1 0.16 ± 2% perf-profile.children.cycles-pp.__ext4_journal_get_write_access
0.22 ± 8% -0.1 0.11 ± 33% perf-profile.children.cycles-pp.ext4_da_write_end
0.25 ± 16% -0.1 0.15 ± 3% perf-profile.children.cycles-pp.jbd2_journal_get_write_access
0.28 ± 3% -0.1 0.17 ± 39% perf-profile.children.cycles-pp.__mark_inode_dirty
0.26 ± 16% -0.1 0.15 ± 21% perf-profile.children.cycles-pp.prepare_creds
0.16 ± 29% -0.1 0.06 ± 63% perf-profile.children.cycles-pp.ext4_mb_mark_diskspace_used
0.29 ± 18% -0.1 0.19 ± 7% perf-profile.children.cycles-pp.__getblk_gfp
0.27 ± 16% -0.1 0.17 ± 8% perf-profile.children.cycles-pp.__find_get_block
0.16 ± 6% -0.1 0.07 ± 58% perf-profile.children.cycles-pp.nfsd_cache_lookup
0.19 ± 22% -0.1 0.10 ± 8% perf-profile.children.cycles-pp.blk_mq_flush_plug_list
0.20 ± 23% -0.1 0.10 ± 10% perf-profile.children.cycles-pp.blk_finish_plug
0.20 ± 24% -0.1 0.10 ± 10% perf-profile.children.cycles-pp.blk_flush_plug_list
0.44 ± 16% -0.1 0.35 ± 20% perf-profile.children.cycles-pp.tcp_ack
0.32 ± 11% -0.1 0.22 ± 11% perf-profile.children.cycles-pp.pagecache_get_page
0.36 ± 7% -0.1 0.27 ± 12% perf-profile.children.cycles-pp.__orc_find
0.23 ± 30% -0.1 0.14 ± 13% perf-profile.children.cycles-pp.__jbd2_journal_file_buffer
0.31 ± 5% -0.1 0.23 ± 14% perf-profile.children.cycles-pp.kmem_cache_alloc
0.36 ± 9% -0.1 0.28 ± 9% perf-profile.children.cycles-pp.orc_find
0.34 ± 17% -0.1 0.26 ± 18% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
0.14 ± 28% -0.1 0.06 ± 62% perf-profile.children.cycles-pp.bit_wait_io
0.26 ± 22% -0.1 0.19 ± 19% perf-profile.children.cycles-pp.__lookup_slow
0.30 ± 15% -0.1 0.23 ± 17% perf-profile.children.cycles-pp.__kfree_skb
0.27 ± 10% -0.1 0.20 ± 17% perf-profile.children.cycles-pp.io_schedule
0.12 ± 16% -0.1 0.05 ± 62% perf-profile.children.cycles-pp.nfsd4_getattr
0.17 ± 22% -0.1 0.10 ± 11% perf-profile.children.cycles-pp.blk_mq_sched_insert_requests
0.22 ± 15% -0.1 0.15 ± 14% perf-profile.children.cycles-pp.put_cred_rcu
0.18 ± 20% -0.1 0.11 ± 29% perf-profile.children.cycles-pp.ext4_mark_iloc_dirty
0.23 ± 16% -0.1 0.17 ± 5% perf-profile.children.cycles-pp.__slab_free
0.21 ± 16% -0.1 0.14 ± 29% perf-profile.children.cycles-pp.memset_erms
0.10 ± 24% -0.1 0.03 ±100% perf-profile.children.cycles-pp.generic_make_request_checks
0.17 ± 17% -0.1 0.10 ± 12% perf-profile.children.cycles-pp.do_get_write_access
0.16 ± 22% -0.1 0.10 ± 11% perf-profile.children.cycles-pp.blk_mq_try_issue_list_directly
0.15 ± 24% -0.1 0.09 ± 8% perf-profile.children.cycles-pp.blk_mq_request_issue_directly
0.13 ± 11% -0.1 0.07 ± 61% perf-profile.children.cycles-pp.__generic_write_end
0.11 ± 20% -0.1 0.04 ± 59% perf-profile.children.cycles-pp.jbd2_journal_dirty_metadata
0.12 ± 23% -0.1 0.07 ± 17% perf-profile.children.cycles-pp.__ext4_handle_dirty_metadata
0.22 ± 12% -0.1 0.16 ± 14% perf-profile.children.cycles-pp.ext4_reserve_inode_write
0.14 ± 20% -0.1 0.09 ± 29% perf-profile.children.cycles-pp.ext4_block_write_begin
0.23 ± 14% -0.1 0.17 ± 19% perf-profile.children.cycles-pp.set_next_entity
0.18 ± 14% -0.1 0.13 ± 11% perf-profile.children.cycles-pp.nfsd4_process_open2
0.17 ± 15% -0.0 0.12 ± 5% perf-profile.children.cycles-pp.___might_sleep
0.11 ± 25% -0.0 0.07 ± 17% perf-profile.children.cycles-pp.crc32c_pcl_intel_update
0.13 ± 23% -0.0 0.09 ± 20% perf-profile.children.cycles-pp.skb_release_data
0.13 ± 13% -0.0 0.09 ± 13% perf-profile.children.cycles-pp.nfs4_get_vfs_file
0.12 ± 18% -0.0 0.09 ± 26% perf-profile.children.cycles-pp.get_page_from_freelist
0.08 ± 19% -0.0 0.04 ± 60% perf-profile.children.cycles-pp._copy_to_iter
0.09 ± 11% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.xas_load
0.09 ± 9% -0.0 0.06 ± 22% perf-profile.children.cycles-pp.nfsd4_close
0.14 ± 3% +0.0 0.17 ± 10% perf-profile.children.cycles-pp.tick_nohz_irq_exit
0.06 ± 61% +0.1 0.11 ± 11% perf-profile.children.cycles-pp.cpu_load_update
0.06 ± 22% +0.1 0.11 perf-profile.children.cycles-pp.account_process_tick
0.36 ± 3% +0.1 0.44 ± 11% perf-profile.children.cycles-pp.native_write_msr
0.06 ± 20% +0.1 0.14 ± 15% perf-profile.children.cycles-pp.jbd2_log_wait_commit
0.32 ± 7% +0.1 0.41 ± 10% perf-profile.children.cycles-pp.lapic_next_deadline
0.00 +0.1 0.13 ± 5% perf-profile.children.cycles-pp.__jbd2_journal_force_commit
0.00 +0.1 0.13 ± 8% perf-profile.children.cycles-pp.nfsd_create_setattr
0.00 +0.1 0.13 ± 3% perf-profile.children.cycles-pp.jbd2_journal_force_commit
0.00 +0.1 0.14 ± 6% perf-profile.children.cycles-pp.ext4_write_inode
0.00 +0.1 0.14 ± 6% perf-profile.children.cycles-pp.ext4_nfs_commit_metadata
0.21 ± 12% +0.2 0.41 ± 30% perf-profile.children.cycles-pp.rcu_check_callbacks
1.17 ± 14% +0.4 1.61 ± 15% perf-profile.children.cycles-pp.update_process_times
1.28 ± 14% +0.5 1.74 ± 14% perf-profile.children.cycles-pp.tick_sched_handle
1.44 ± 14% +0.6 2.00 ± 15% perf-profile.children.cycles-pp.tick_sched_timer
2.33 ± 7% +0.8 3.16 ± 10% perf-profile.children.cycles-pp.__hrtimer_run_queues
3.52 ± 5% +1.3 4.78 ± 11% perf-profile.children.cycles-pp.hrtimer_interrupt
7.47 ± 3% +1.8 9.24 ± 8% perf-profile.children.cycles-pp.apic_timer_interrupt
6.57 ± 4% +1.8 8.41 ± 10% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
73.52 ± 2% +4.6 78.17 ± 3% perf-profile.children.cycles-pp.start_secondary
0.93 ± 21% -0.6 0.33 ± 16% perf-profile.self.cycles-pp.poll_idle
0.35 ± 8% -0.1 0.27 ± 13% perf-profile.self.cycles-pp.__orc_find
0.23 ± 17% -0.1 0.16 ± 4% perf-profile.self.cycles-pp.__slab_free
0.23 ± 8% -0.1 0.16 ± 13% perf-profile.self.cycles-pp.orc_find
0.11 ± 17% -0.1 0.05 ± 58% perf-profile.self.cycles-pp.__find_get_block
0.08 ± 8% -0.1 0.03 ±100% perf-profile.self.cycles-pp.nfsd_cache_lookup
0.11 ± 15% -0.0 0.07 ± 17% perf-profile.self.cycles-pp.__jbd2_journal_file_buffer
0.16 ± 13% -0.0 0.12 ± 9% perf-profile.self.cycles-pp.___might_sleep
0.14 ± 18% -0.0 0.09 ± 15% perf-profile.self.cycles-pp.prepare_creds
0.07 ± 17% -0.0 0.03 ±102% perf-profile.self.cycles-pp.crc32c_pcl_intel_update
0.20 ± 12% -0.0 0.16 ± 7% perf-profile.self.cycles-pp.tcp_recvmsg
0.14 ± 13% -0.0 0.10 ± 21% perf-profile.self.cycles-pp.set_next_entity
0.16 ± 2% -0.0 0.14 ± 8% perf-profile.self.cycles-pp.kmem_cache_alloc
0.06 ± 20% +0.0 0.09 ± 13% perf-profile.self.cycles-pp.lapic_next_deadline
0.07 ± 13% +0.0 0.10 ± 26% perf-profile.self.cycles-pp.rcu_process_callbacks
0.10 ± 8% +0.0 0.14 ± 5% perf-profile.self.cycles-pp.irq_exit
0.06 ± 61% +0.1 0.11 ± 11% perf-profile.self.cycles-pp.cpu_load_update
0.06 ± 22% +0.1 0.11 perf-profile.self.cycles-pp.account_process_tick
0.00 +0.1 0.05 ± 9% perf-profile.self.cycles-pp.timerqueue_del
0.19 ± 13% +0.2 0.38 ± 29% perf-profile.self.cycles-pp.rcu_check_callbacks
1.33 ± 9% +0.3 1.65 ± 9% perf-profile.self.cycles-pp.menu_select
fsmark.time.system_time
275 +-+-----------------------------O------------O------------------------+
270 +-+ O |
| O |
265 O-+ O O O |
260 +-+ O O O O
| |
255 +-+ |
250 +-+ |
245 +-+ |
| |
240 +-+ |
235 +-+ |
|..... ...+..... ...+...... ...+...... ...+.....+...... ...|
230 +-+ +... +.. +.. +.. +.. |
225 +-+-------------------------------------------------------------------+
fsmark.time.percent_of_cpu_this_job_got
60 +-+--------------------------------------------------------------------+
58 +-+... ...+..... ...+.....+......+.....+......+..... ...+.....|
| +... +... +... |
56 +-+ |
54 +-+ |
| |
52 +-+ |
50 +-+ |
48 +-+ |
| |
46 +-+ |
44 +-+ |
| O O O |
42 O-+ O O O O O O O O
40 +-+--------------------------------------------------------------------+
fsmark.time.elapsed_time
700 +-+-----------------------------O------------O-----------O------------+
O O O O O O O O O
650 +-+ |
| |
| |
600 +-+ |
| |
550 +-+ |
| |
500 +-+ |
| |
| |
450 +-+ ...+...... |
|.....+......+.....+.....+......+.....+......+.....+.. +.....|
400 +-+-------------------------------------------------------------------+
fsmark.time.elapsed_time.max
700 +-+-----------------------------O------------O-----------O------------+
O O O O O O O O O
650 +-+ |
| |
| |
600 +-+ |
| |
550 +-+ |
| |
500 +-+ |
| |
| |
450 +-+ ...+...... |
|.....+......+.....+.....+......+.....+......+.....+.. +.....|
400 +-+-------------------------------------------------------------------+
fsmark.time.voluntary_context_switches
1.14e+07 +-+--------------------------------------------------------------+
| |
1.13e+07 +-+ ...+.....+.... ...+..... |
|.....+.....+.....+.. +.....+.....+.. +.....|
1.12e+07 +-+ |
1.11e+07 +-+ |
| |
1.1e+07 +-+ |
| |
1.09e+07 +-+ |
1.08e+07 +-+ |
| |
1.07e+07 O-+ O O |
| O O O O O O O O O
1.06e+07 +-+--------------------------------------------------------------+
fsmark.files_per_sec
6500 +-+------------------------------------------------------------------+
|.....+......+.....+.....+.....+......+.....+.....+..... ...+.....|
6000 +-+ +... |
| |
| |
5500 +-+ |
| |
5000 +-+ |
| |
4500 +-+ |
| |
| |
4000 O-+ O O O O O O O O O O O
| |
3500 +-+------------------------------------------------------------------+
nfsstat.Client.nfs.v3.lookup.percent
21 +-+--------------------------------------------------------------------+
| |
20 +-+ |
19 +-+ |
| |
18 +-+ O O O O O O O O O O O
17 +-+ |
| |
16 +-+ |
15 +-+ |
| |
14 +-+ |
13 +-+ |
| |
12 O-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v3.access.percent
19 +-+--------------------------------------------------------------------+
| |
18 +-+ |
17 +-+ |
| |
16 +-+ O O O O O O O O O O O
| |
15 +-+ |
| |
14 +-+ |
13 +-+ |
| |
12 +-+ |
| |
11 O-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v3.write
35 +-+--------------------------------------------------------------------+
O |
30 +-+ |
| |
| |
25 +-+ |
| |
20 +-+ |
| |
15 +-+ |
| O O O O O O O O O O O
| |
10 +-+ ...+..... |
|.....+......+.....+......+.....+... +......+.....+......+.....|
5 +-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v3.write.percent
30 +-+--------------------------------------------------------------------+
O |
28 +-+ |
| |
26 +-+ |
| |
24 +-+ |
| |
22 +-+ |
| O O O O O O O O O O O
20 +-+ |
| |
18 +-+ |
| |
16 +-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v3.commit
35 O-+--------------------------------------------------------------------+
| |
30 +-+ |
| |
| |
25 +-+ |
| |
20 +-+ |
| |
15 +-+ |
| O O O O O O O O O O O
| |
10 +-+ ...+..... |
|.....+......+.....+......+.....+... +......+.....+......+.....|
5 +-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v3.commit.percent
30 O-+--------------------------------------------------------------------+
| |
28 +-+ |
| |
26 +-+ |
| |
24 +-+ |
| |
22 +-+ |
| O O O O O O O O O O O
20 +-+ |
| |
18 +-+ |
| |
16 +-+--------------------------------------------------------------------+
nfsstat.Client.nfs.v4.getattr
32 +-+--------------------------------------------------------------------+
| O |
30 +-+ |
| O O |
28 O-+ O O O O O O O O
| |
26 +-+ |
| |
24 +-+ |
| |
22 +-+ |
| |
20 +-+ |
| ...+..... ...+...... ...|
18 +-+--------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-4.20.0-rc4-00014-gfde87268" of type "text/plain" (168537 bytes)
View attachment "job-script" of type "text/plain" (7815 bytes)
View attachment "job.yaml" of type "text/plain" (5365 bytes)
View attachment "reproduce" of type "text/plain" (1010 bytes)