Message-ID: <20191204095830.GZ18573@shao2-debian>
Date: Wed, 4 Dec 2019 17:58:30 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: Christoph Hellwig <hch@....de>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
lkp@...ts.01.org
Subject: [block] 344e9ffcbd: fsmark.files_per_sec 15.9% improvement
Greetings,
FYI, we noticed a 15.9% improvement of fsmark.files_per_sec due to commit:
commit: 344e9ffcbd1898e1dc04085564a6e05c30ea8199 ("block: add queue_is_mq() helper")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
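For context, the commit under test is a mechanical refactor: it adds a
helper that reports whether a request_queue is managed by blk-mq and
converts the open-coded q->mq_ops checks across the block layer to use
it. Paraphrased from memory of the upstream change to
include/linux/blkdev.h (an illustration, not a verbatim quote):

    static inline bool queue_is_mq(struct request_queue *q)
    {
            /* a non-NULL mq_ops means the queue is driven by blk-mq */
            return q->mq_ops;
    }

Call sites such as "if (q->mq_ops)" become "if (queue_is_mq(q))", which
should compile to identical code; a swing this large from a refactor of
this size more plausibly reflects incidental code-layout or alignment
effects than an algorithmic difference.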
in testcase: fsmark
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
iterations: 1x
nr_threads: 32t
disk: 1SSD
fs: xfs
filesize: 9B
test_size: 400M
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
ucode: 0x500002b
test-description: fsmark is a file system benchmark that exercises synchronous write workloads, such as those seen on mail servers.
test-url: https://sourceforge.net/projects/fsmark/
In addition, the commit also has a significant impact on the following test:
+------------------+----------------------------------------------------------------------+
| testcase: change | fileio: iostat.sda.wkB/s 6.1% improvement |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | disk=1HDD |
| | filenum=1024f |
| | fs=xfs |
| | iomode=sync |
| | nr_threads=100% |
| | period=600s |
| | rwmode=rndwr |
| | size=64G |
| | ucode=0xb00002e |
+------------------+----------------------------------------------------------------------+
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
gcc-7/performance/1SSD/9B/xfs/1x/x86_64-rhel-7.6/16d/256fpd/32t/debian-x86_64-2019-09-23.cgz/fsyncBeforeClose/lkp-csl-2sp7/400M/fsmark/0x500002b
commit:
dabcefab45 ("nvme: provide optimized poll function for separate poll queues")
344e9ffcbd ("block: add queue_is_mq() helper")
dabcefab45d36ecb 344e9ffcbd1898e1dc04085564a
---------------- ---------------------------
%stddev %change %stddev
\ | \
16845 +15.9% 19530 fsmark.files_per_sec
122073 -8.5% 111641 ± 2% fsmark.time.involuntary_context_switches
231.67 ± 2% +7.5% 249.00 ± 2% fsmark.time.percent_of_cpu_this_job_got
15.09 ± 2% -5.1% 14.33 ± 2% fsmark.time.system_time
542964 -2.6% 528818 fsmark.time.voluntary_context_switches
429516 ± 3% -16.7% 357836 ± 6% meminfo.DirectMap4k
2.85 ± 11% -0.5 2.36 ± 8% mpstat.cpu.all.iowait%
12.99 ± 15% +27.9% 16.61 ± 11% turbostat.CPU%c6
128324 ± 18% +15.2% 147810 ± 13% numa-numastat.node0.local_node
159384 ± 15% +12.3% 178965 ± 11% numa-numastat.node0.numa_hit
2.41 ± 11% -17.0% 2.00 ± 7% iostat.cpu.iowait
2.41 -6.0% 2.27 ± 8% iostat.sdb.avgqu-sz
0.65 ± 2% +85.2% 1.21 ± 82% iostat.sdb.w_await.max
15843 ±141% -100.0% 1.25 ±131% softirqs.CPU0.BLOCK
9608 ± 7% -10.4% 8613 ± 9% softirqs.CPU30.TIMER
10112 ± 5% -13.0% 8801 ± 4% softirqs.CPU9.TIMER
3204 ± 11% +15.4% 3697 ± 12% slabinfo.eventpoll_pwq.active_objs
3204 ± 11% +15.4% 3697 ± 12% slabinfo.eventpoll_pwq.num_objs
8830 ± 3% +16.8% 10314 ± 3% slabinfo.kmalloc-1k.active_objs
8932 ± 3% +16.7% 10423 ± 3% slabinfo.kmalloc-1k.num_objs
67172 -1.4% 66257 proc-vmstat.nr_active_anon
67077 -1.4% 66118 proc-vmstat.nr_anon_pages
36011 ± 2% -6.3% 33759 ± 4% proc-vmstat.nr_inactive_file
16948 -2.0% 16603 proc-vmstat.nr_kernel_stack
1271 -5.1% 1207 proc-vmstat.nr_page_table_pages
67172 -1.4% 66257 proc-vmstat.nr_zone_active_anon
36011 ± 2% -6.3% 33759 ± 4% proc-vmstat.nr_zone_inactive_file
1102105 +1.2% 1115271 proc-vmstat.pgpgout
47857 ± 23% -50.0% 23941 ± 52% numa-vmstat.node0.nr_active_anon
47776 ± 23% -50.1% 23834 ± 52% numa-vmstat.node0.nr_anon_pages
32174 ± 5% +28.2% 41236 ± 4% numa-vmstat.node0.nr_dirtied
33014 ± 5% +27.5% 42078 ± 4% numa-vmstat.node0.nr_written
47857 ± 23% -50.0% 23941 ± 52% numa-vmstat.node0.nr_zone_active_anon
19118 ± 55% +121.4% 42322 ± 30% numa-vmstat.node1.nr_active_anon
19132 ± 55% +121.0% 42290 ± 30% numa-vmstat.node1.nr_anon_pages
11003 ± 20% -51.7% 5320 ± 34% numa-vmstat.node1.nr_inactive_file
19118 ± 55% +121.4% 42322 ± 30% numa-vmstat.node1.nr_zone_active_anon
11003 ± 20% -51.7% 5320 ± 34% numa-vmstat.node1.nr_zone_inactive_file
192100 ± 22% -50.1% 95812 ± 52% numa-meminfo.node0.Active
192100 ± 22% -50.1% 95766 ± 52% numa-meminfo.node0.Active(anon)
116053 ± 13% -81.1% 21888 ± 87% numa-meminfo.node0.AnonHugePages
191689 ± 22% -50.3% 95338 ± 52% numa-meminfo.node0.AnonPages
101256 ± 14% +12.4% 113765 ± 11% numa-meminfo.node0.Inactive(file)
76790 ± 55% +120.6% 169420 ± 30% numa-meminfo.node1.Active
76607 ± 55% +121.0% 169282 ± 30% numa-meminfo.node1.Active(anon)
20934 ± 61% +461.7% 117580 ± 19% numa-meminfo.node1.AnonHugePages
76670 ± 55% +120.6% 169137 ± 30% numa-meminfo.node1.AnonPages
42821 ± 24% -50.3% 21274 ± 34% numa-meminfo.node1.Inactive(file)
975968 ± 3% +16.2% 1133711 ± 9% numa-meminfo.node1.MemUsed
1969 ± 9% -16.2% 1650 ± 9% sched_debug.cfs_rq:/.load.avg
5724 ± 18% -32.9% 3841 ± 20% sched_debug.cfs_rq:/.min_vruntime.stddev
13.90 ± 36% +51.9% 21.12 ± 34% sched_debug.cfs_rq:/.removed.load_avg.avg
115.12 ± 18% +23.7% 142.38 ± 17% sched_debug.cfs_rq:/.removed.load_avg.stddev
641.03 ± 37% +52.7% 978.57 ± 34% sched_debug.cfs_rq:/.removed.runnable_sum.avg
5305 ± 18% +24.3% 6596 ± 17% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
5724 ± 18% -33.1% 3826 ± 21% sched_debug.cfs_rq:/.spread0.stddev
2.02 ± 22% -27.7% 1.46 ± 7% sched_debug.cpu.cpu_load[1].avg
2.57 ± 22% -35.4% 1.66 ± 4% sched_debug.cpu.cpu_load[2].avg
7.94 ± 70% -58.3% 3.31 sched_debug.cpu.cpu_load[2].stddev
2.74 ± 17% -33.4% 1.83 ± 7% sched_debug.cpu.cpu_load[3].avg
7.38 ± 49% -54.4% 3.36 ± 6% sched_debug.cpu.cpu_load[3].stddev
2.38 ± 15% -33.6% 1.58 ± 10% sched_debug.cpu.cpu_load[4].avg
9.89 ± 70% -7.8 2.08 ±173% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.33 ± 79% -7.3 0.00 perf-profile.calltrace.cycles-pp.may_open.path_openat.do_filp_open.do_sys_open.do_syscall_64
7.33 ± 79% -7.3 0.00 perf-profile.calltrace.cycles-pp.security_inode_permission.may_open.path_openat.do_filp_open.do_sys_open
7.33 ± 79% -7.3 0.00 perf-profile.calltrace.cycles-pp.selinux_inode_permission.security_inode_permission.may_open.path_openat.do_filp_open
7.33 ± 79% -5.2 2.08 ±173% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.33 ± 79% -5.2 2.08 ±173% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.89 ± 70% -7.8 2.08 ±173% perf-profile.children.cycles-pp.do_sys_open
9.89 ± 70% -7.8 2.08 ±173% perf-profile.children.cycles-pp.do_filp_open
9.89 ± 70% -7.8 2.08 ±173% perf-profile.children.cycles-pp.path_openat
7.33 ± 79% -7.3 0.00 perf-profile.children.cycles-pp.__free_pages_ok
7.33 ± 79% -7.3 0.00 perf-profile.children.cycles-pp.free_one_page
7.33 ± 79% -7.3 0.00 perf-profile.children.cycles-pp.may_open
5.90 ± 72% -5.9 0.00 perf-profile.children.cycles-pp.__sched_text_start
5.90 ± 72% -5.9 0.00 perf-profile.children.cycles-pp.schedule
7.33 ± 79% -5.2 2.08 ±173% perf-profile.children.cycles-pp.security_inode_permission
7.33 ± 79% -5.2 2.08 ±173% perf-profile.children.cycles-pp.selinux_inode_permission
17642600 -20.2% 14085540 ± 9% perf-stat.i.cache-misses
4907 ± 2% -30.6% 3405 ± 13% perf-stat.i.cpu-migrations
952.61 +29.3% 1231 ± 14% perf-stat.i.cycles-between-cache-misses
7034 -12.4% 6163 ± 9% perf-stat.i.minor-faults
6367482 -22.7% 4919325 ± 9% perf-stat.i.node-load-misses
84.40 -6.6 77.77 ± 4% perf-stat.i.node-store-miss-rate%
2878736 -26.2% 2124511 ± 9% perf-stat.i.node-store-misses
7034 -12.4% 6163 ± 9% perf-stat.i.page-faults
15.35 -3.6 11.71 ± 12% perf-stat.overall.cache-miss-rate%
948.47 +29.6% 1229 ± 13% perf-stat.overall.cycles-between-cache-misses
0.01 ± 4% -0.0 0.01 ± 12% perf-stat.overall.dTLB-store-miss-rate%
84.41 -6.4 78.03 ± 4% perf-stat.overall.node-store-miss-rate%
13526739 ± 2% -21.7% 10587792 ± 14% perf-stat.ps.cache-misses
3761 ± 2% -32.2% 2551 ± 14% perf-stat.ps.cpu-migrations
5395 ± 3% -14.4% 4617 ± 11% perf-stat.ps.minor-faults
4882260 ± 2% -24.4% 3691428 ± 12% perf-stat.ps.node-load-misses
2207749 ± 3% -27.7% 1595213 ± 13% perf-stat.ps.node-store-misses
5395 ± 3% -14.4% 4617 ± 11% perf-stat.ps.page-faults
586.00 ± 15% +55.2% 909.25 ± 36% interrupts.CPU14.CAL:Function_call_interrupts
105.67 ± 19% +407.3% 536.00 ±130% interrupts.CPU19.RES:Rescheduling_interrupts
88.00 ± 14% -44.3% 49.00 ± 35% interrupts.CPU25.RES:Rescheduling_interrupts
88.33 ± 10% -39.2% 53.75 ± 22% interrupts.CPU28.RES:Rescheduling_interrupts
98.67 ± 41% -63.0% 36.50 ± 18% interrupts.CPU31.RES:Rescheduling_interrupts
1583 ± 8% -25.2% 1184 ± 18% interrupts.CPU33.CAL:Function_call_interrupts
91.67 ± 12% -59.9% 36.75 ± 20% interrupts.CPU33.RES:Rescheduling_interrupts
86.00 ± 16% -52.9% 40.50 ± 32% interrupts.CPU34.RES:Rescheduling_interrupts
94.00 ± 13% -50.3% 46.75 ± 34% interrupts.CPU35.RES:Rescheduling_interrupts
78.33 ± 22% -46.4% 42.00 ± 38% interrupts.CPU39.RES:Rescheduling_interrupts
86.00 ± 15% -59.0% 35.25 ± 47% interrupts.CPU40.RES:Rescheduling_interrupts
1510 ± 8% -33.1% 1010 ± 26% interrupts.CPU41.CAL:Function_call_interrupts
91.00 ± 28% -52.7% 43.00 ± 23% interrupts.CPU44.RES:Rescheduling_interrupts
81.00 ± 23% -47.5% 42.50 ± 36% interrupts.CPU74.RES:Rescheduling_interrupts
75.67 ± 30% -51.8% 36.50 ± 20% interrupts.CPU76.RES:Rescheduling_interrupts
86.33 ± 22% -57.1% 37.00 ± 23% interrupts.CPU77.RES:Rescheduling_interrupts
83.67 ± 17% -50.7% 41.25 ± 29% interrupts.CPU78.RES:Rescheduling_interrupts
92.67 ± 18% -52.8% 43.75 ± 34% interrupts.CPU79.RES:Rescheduling_interrupts
89.33 ± 12% -43.2% 50.75 ± 27% interrupts.CPU82.RES:Rescheduling_interrupts
98.00 ± 22% -52.6% 46.50 ± 19% interrupts.CPU83.RES:Rescheduling_interrupts
97.33 ± 26% -59.7% 39.25 ± 63% interrupts.CPU84.RES:Rescheduling_interrupts
320.00 ±104% -87.1% 41.25 ± 21% interrupts.CPU86.RES:Rescheduling_interrupts
88.33 ± 22% -45.1% 48.50 ± 43% interrupts.CPU88.RES:Rescheduling_interrupts
92.67 ± 15% -56.8% 40.00 ± 44% interrupts.CPU89.RES:Rescheduling_interrupts
88.67 ± 15% -57.1% 38.00 ± 33% interrupts.CPU90.RES:Rescheduling_interrupts
93.67 ± 17% -54.6% 42.50 ± 35% interrupts.CPU95.RES:Rescheduling_interrupts
fsmark.files_per_sec
25000 +-+-----------------------------------------------------------------+
| |
| |
20000 O-O O O O O O O O O O O O O O O O O O O O O O |
|.+. .+. .+. +. .+.+.+. .+. .+.+ +.+.+.+.+ |
| + +.+ +.+ + : +.+ + + + : : : +.+.|
15000 +-+ : : : : : : : : : : |
| : : : : : : : : : : |
10000 +-+ : : : : : : : : : : |
| : : : : : : : : : : |
| : : : : : : : : : : |
5000 +-+ : : : : : : : : : : |
| : : : : : |
| : : : : : |
0 +-+---------O-------O-------------------O---------------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-bdw-ep3c: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/disk/filenum/fs/iomode/kconfig/nr_threads/period/rootfs/rwmode/size/tbox_group/testcase/ucode:
gcc-7/performance/1HDD/1024f/xfs/sync/x86_64-rhel-7.2/100%/600s/debian-x86_64-2018-04-03.cgz/rndwr/64G/lkp-bdw-ep3c/fileio/0xb00002e
commit:
dabcefab45 ("nvme: provide optimized poll function for separate poll queues")
344e9ffcbd ("block: add queue_is_mq() helper")
dabcefab45d36ecb 344e9ffcbd1898e1dc04085564a
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.82 ± 2% -25.5% 1.35 fileio.request_latency_avg_ms
146.73 +7.3% 157.47 fileio.requests_per_sec
11.91 ± 2% -2.4 9.46 ± 6% fileio.thread_events_stddev%
515696 ± 3% +9.4% 564189 ± 4% fileio.time.involuntary_context_switches
23.50 ± 2% +128.7% 53.75 ± 2% fileio.time.percent_of_cpu_this_job_got
142.59 ± 3% +123.8% 319.15 ± 5% fileio.time.system_time
1092 ± 4% -8.3% 1002 ± 3% meminfo.Dirty
272.25 ± 4% -8.5% 249.00 ± 3% proc-vmstat.nr_dirty
2.508e+10 ± 7% +11.3% 2.792e+10 ± 3% cpuidle.C3.time
1.291e+08 ±171% -99.0% 1315279 ± 7% cpuidle.POLL.time
18566208 ±169% -98.2% 326061 ± 8% cpuidle.POLL.usage
69.75 ± 14% +25.4% 87.50 ± 5% turbostat.Avg_MHz
4.05 ± 9% +1.1 5.11 ± 3% turbostat.Busy%
46.69 ± 7% +6.5 53.23 ± 6% turbostat.C3%
40.10 ± 5% -20.3% 31.96 iostat.cpu.idle
59.55 ± 3% +13.1% 67.37 iostat.cpu.iowait
1495 +2.7% 1536 iostat.sda.w/s
2660 +6.1% 2822 iostat.sda.wkB/s
40.00 ± 5% -8.2 31.85 mpstat.cpu.idle%
59.65 ± 3% +7.8 67.48 mpstat.cpu.iowait%
0.29 ± 5% +0.3 0.61 ± 3% mpstat.cpu.sys%
0.02 ± 11% +0.0 0.03 ± 7% mpstat.cpu.usr%
39.75 ± 5% -21.4% 31.25 vmstat.cpu.id
59.25 ± 3% +12.7% 66.75 vmstat.cpu.wa
2640 +6.1% 2801 vmstat.io.bo
7595 +5.8% 8037 vmstat.system.cs
3.002e+08 ± 8% +21.2% 3.638e+08 ± 3% perf-stat.cache-misses
3.865e+12 ± 16% +26.9% 4.903e+12 perf-stat.cpu-cycles
88656 ± 8% +68.8% 149686 ± 9% perf-stat.cpu-migrations
1.493e+08 ± 8% +19.4% 1.782e+08 ± 2% perf-stat.node-load-misses
44.19 ± 5% +7.2 51.37 ± 4% perf-stat.node-store-miss-rate%
39430465 ± 11% +48.0% 58375753 ± 8% perf-stat.node-store-misses
49645857 ± 7% +10.9% 55057560 ± 2% perf-stat.node-stores
59410 ± 5% +17.5% 69831 ± 6% softirqs.CPU10.RCU
57650 ± 2% +13.0% 65170 ± 5% softirqs.CPU13.RCU
58102 ± 17% +24.0% 72018 ± 6% softirqs.CPU14.RCU
42002 ± 6% +6.2% 44625 ± 5% softirqs.CPU16.RCU
56696 +16.7% 66176 ± 7% softirqs.CPU2.RCU
61052 ± 5% +16.4% 71041 ± 6% softirqs.CPU3.RCU
47619 ± 13% +15.1% 54819 ± 5% softirqs.CPU39.RCU
57484 ± 7% +17.0% 67267 ± 9% softirqs.CPU46.RCU
61275 ± 4% +15.3% 70680 ± 6% softirqs.CPU47.RCU
61345 ± 9% +14.2% 70038 ± 9% softirqs.CPU53.RCU
57258 ± 8% +11.6% 63882 ± 9% softirqs.CPU57.RCU
69231 ± 6% -17.5% 57085 ± 2% softirqs.CPU65.SCHED
56920 ± 6% +12.6% 64075 ± 7% softirqs.CPU7.RCU
58671 ± 12% +18.0% 69255 ± 4% softirqs.CPU86.RCU
1121 ± 6% +84.8% 2073 ± 2% sched_debug.cfs_rq:/.exec_clock.avg
572.89 ± 31% +168.5% 1538 ± 8% sched_debug.cfs_rq:/.exec_clock.min
36052 ±126% -77.6% 8067 ± 9% sched_debug.cfs_rq:/.load.avg
2659824 ±146% -86.4% 361376 sched_debug.cfs_rq:/.load.max
298229 ±139% -83.4% 49377 ± 4% sched_debug.cfs_rq:/.load.stddev
7682 ± 7% +27.0% 9759 ± 11% sched_debug.cfs_rq:/.min_vruntime.avg
2265 ± 36% +52.3% 3449 ± 30% sched_debug.cfs_rq:/.min_vruntime.min
5.65 ± 8% -19.8% 4.53 ± 8% sched_debug.cfs_rq:/.runnable_load_avg.avg
391.08 ± 4% -10.2% 351.28 sched_debug.cfs_rq:/.runnable_load_avg.max
44.46 ± 4% -14.4% 38.06 ± 3% sched_debug.cfs_rq:/.runnable_load_avg.stddev
36043 ±126% -77.9% 7981 ± 10% sched_debug.cfs_rq:/.runnable_weight.avg
2659107 ±146% -86.5% 359609 ± 2% sched_debug.cfs_rq:/.runnable_weight.max
298182 ±139% -83.5% 49072 ± 5% sched_debug.cfs_rq:/.runnable_weight.stddev
387.93 ± 4% -9.4% 351.32 sched_debug.cpu.cpu_load[0].max
84205 ± 51% -90.3% 8137 ± 11% sched_debug.cpu.load.avg
6983828 ± 54% -94.5% 384493 ± 11% sched_debug.cpu.load.max
751762 ± 53% -93.2% 51250 ± 10% sched_debug.cpu.load.stddev
13045 ± 30% +48.5% 19371 ± 9% sched_debug.cpu.nr_switches.min
30.30 ± 20% +298.9% 120.86 ± 17% sched_debug.cpu.nr_uninterruptible.max
-49.51 +2144.0% -1110 sched_debug.cpu.nr_uninterruptible.min
10.55 ± 15% +1050.5% 121.42 ± 15% sched_debug.cpu.nr_uninterruptible.stddev
126088 ± 11% -19.4% 101651 ± 7% sched_debug.cpu.sched_count.max
12664 ± 31% +49.8% 18977 ± 8% sched_debug.cpu.sched_count.min
14752 ± 13% -27.6% 10686 ± 13% sched_debug.cpu.sched_count.stddev
3236 ± 17% +39.3% 4509 ± 7% sched_debug.cpu.ttwu_count.min
2314 ± 20% +69.2% 3916 ± 3% sched_debug.cpu.ttwu_local.min
4.25 ± 30% -2.6 1.60 ± 30% perf-profile.calltrace.cycles-pp.__hrtimer_get_next_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
1.76 ± 30% -1.2 0.54 ± 62% perf-profile.calltrace.cycles-pp.__hrtimer_next_event_base.__hrtimer_get_next_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
0.00 +1.0 0.97 ± 9% perf-profile.calltrace.cycles-pp.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.do_fsync
0.00 +1.0 0.97 ± 9% perf-profile.calltrace.cycles-pp.generic_make_request.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync
0.00 +1.0 0.97 ± 9% perf-profile.calltrace.cycles-pp.blk_mq_make_request.generic_make_request.submit_bio.submit_bio_wait.blkdev_issue_flush
0.00 +1.0 0.98 ± 9% perf-profile.calltrace.cycles-pp.blkdev_issue_flush.xfs_file_fsync.do_fsync.__x64_sys_fsync.do_syscall_64
0.00 +1.0 0.98 ± 9% perf-profile.calltrace.cycles-pp.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.do_fsync.__x64_sys_fsync
0.00 +1.0 1.04 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.0 1.04 ± 10% perf-profile.calltrace.cycles-pp.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.0 1.04 ± 10% perf-profile.calltrace.cycles-pp.xfs_file_fsync.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.1 1.06 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +1.1 1.06 ± 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
35.75 ± 20% +17.2 52.96 ± 6% perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
51.24 ± 44% +18.7 69.89 ± 5% perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
42.59 ± 25% +26.0 68.58 ± 8% perf-profile.calltrace.cycles-pp.read_tsc.ktime_get.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt
5.60 ± 46% -2.7 2.86 ± 13% perf-profile.children.cycles-pp.__hrtimer_next_event_base
4.34 ± 31% -2.6 1.70 ± 29% perf-profile.children.cycles-pp.__hrtimer_get_next_event
0.49 ± 62% -0.3 0.20 ± 14% perf-profile.children.cycles-pp.rcu_nmi_enter
0.15 ± 12% -0.0 0.11 ± 15% perf-profile.children.cycles-pp.__set_pte_vaddr
0.23 ± 8% -0.0 0.20 ± 7% perf-profile.children.cycles-pp.set_pte_vaddr
0.01 ±173% +0.1 0.08 ± 33% perf-profile.children.cycles-pp.__sched_text_start
0.01 ±173% +0.1 0.08 ± 33% perf-profile.children.cycles-pp.finish_task_switch
0.32 ± 12% +0.1 0.47 ± 13% perf-profile.children.cycles-pp.rcu_nmi_exit
0.95 ± 9% +0.2 1.11 ± 5% perf-profile.children.cycles-pp.native_apic_msr_write
0.13 ± 36% +0.3 0.47 ± 14% perf-profile.children.cycles-pp.blk_insert_flush
0.10 ± 39% +0.4 0.51 ± 10% perf-profile.children.cycles-pp.kblockd_mod_delayed_work_on
0.10 ± 39% +0.4 0.51 ± 10% perf-profile.children.cycles-pp.mod_delayed_work_on
0.10 ± 39% +0.4 0.52 ± 11% perf-profile.children.cycles-pp.blk_mq_run_hw_queue
0.08 ± 67% +0.4 0.49 ± 9% perf-profile.children.cycles-pp.try_to_grab_pending
0.24 ± 32% +0.7 0.98 ± 9% perf-profile.children.cycles-pp.blkdev_issue_flush
0.24 ± 32% +0.7 0.98 ± 9% perf-profile.children.cycles-pp.submit_bio_wait
0.24 ± 34% +0.8 0.99 ± 9% perf-profile.children.cycles-pp.submit_bio
0.24 ± 34% +0.8 0.99 ± 9% perf-profile.children.cycles-pp.generic_make_request
0.24 ± 35% +0.8 0.99 ± 10% perf-profile.children.cycles-pp.blk_mq_make_request
0.26 ± 31% +0.8 1.04 ± 10% perf-profile.children.cycles-pp.__x64_sys_fsync
0.26 ± 31% +0.8 1.04 ± 10% perf-profile.children.cycles-pp.do_fsync
0.26 ± 31% +0.8 1.04 ± 10% perf-profile.children.cycles-pp.xfs_file_fsync
0.30 ± 30% +0.8 1.11 ± 10% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.30 ± 30% +0.8 1.11 ± 10% perf-profile.children.cycles-pp.do_syscall_64
27.17 ± 17% +13.6 40.79 ± 7% perf-profile.children.cycles-pp.read_tsc
36.11 ± 11% +14.6 50.68 ± 6% perf-profile.children.cycles-pp.ktime_get
0.94 ± 47% -0.4 0.50 ± 10% perf-profile.self.cycles-pp.__hrtimer_next_event_base
0.49 ± 62% -0.3 0.20 ± 14% perf-profile.self.cycles-pp.rcu_nmi_enter
0.40 ± 33% -0.2 0.17 ± 23% perf-profile.self.cycles-pp.__hrtimer_get_next_event
0.41 ± 57% -0.2 0.18 ± 40% perf-profile.self.cycles-pp.__hrtimer_run_queues
0.15 ± 12% -0.0 0.11 ± 15% perf-profile.self.cycles-pp.__set_pte_vaddr
0.23 ± 8% -0.0 0.20 ± 7% perf-profile.self.cycles-pp.set_pte_vaddr
0.32 ± 12% +0.1 0.47 ± 13% perf-profile.self.cycles-pp.rcu_nmi_exit
0.95 ± 9% +0.2 1.11 ± 5% perf-profile.self.cycles-pp.native_apic_msr_write
5.95 ± 22% +2.0 7.97 ± 7% perf-profile.self.cycles-pp.read_tsc
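The calltrace rows above show roughly one percentage point of cycles
moving into the fsync flush path (xfs_file_fsync -> blkdev_issue_flush
-> submit_bio_wait -> blk_mq_make_request) as fileio throughput rises.
For orientation, the flush issue path in kernels of this vintage looks
roughly like the following (a from-memory sketch of
blkdev_issue_flush() in block/blk-flush.c, not a verbatim quote):

    int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
                           sector_t *error_sector)
    {
            struct bio *bio;
            int ret;

            /* an empty bio flagged REQ_PREFLUSH asks the device to
             * flush its volatile write cache; it carries no data */
            bio = bio_alloc(gfp_mask, 0);
            bio_set_dev(bio, bdev);
            bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

            /* submit_bio_wait() feeds the bio to generic_make_request()
             * (hence blk_mq_make_request above) and sleeps until done */
            ret = submit_bio_wait(bio);
            if (error_sector)
                    *error_sector = bio->bi_iter.bi_sector;
            bio_put(bio);
            return ret;
    }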
577915 +5.0% 606855 ± 2% interrupts.CAL:Function_call_interrupts
241.50 ± 34% +146.9% 596.25 ± 16% interrupts.CPU0.TLB:TLB_shootdowns
258.50 ± 45% +149.3% 644.50 ± 15% interrupts.CPU1.TLB:TLB_shootdowns
217.75 ± 39% +158.2% 562.25 ± 9% interrupts.CPU10.TLB:TLB_shootdowns
444.25 ± 15% +21.8% 541.00 ± 14% interrupts.CPU11.RES:Rescheduling_interrupts
221.50 ± 35% +237.2% 747.00 ± 19% interrupts.CPU11.TLB:TLB_shootdowns
250.75 ± 36% +121.1% 554.50 ± 11% interrupts.CPU12.TLB:TLB_shootdowns
211.00 ± 36% +177.1% 584.75 ± 11% interrupts.CPU13.TLB:TLB_shootdowns
229.75 ± 28% +153.6% 582.75 ± 16% interrupts.CPU14.TLB:TLB_shootdowns
406.75 ± 6% +67.9% 683.00 ± 39% interrupts.CPU15.RES:Rescheduling_interrupts
257.50 ± 25% +144.4% 629.25 ± 26% interrupts.CPU15.TLB:TLB_shootdowns
251.75 ± 25% +133.3% 587.25 ± 14% interrupts.CPU16.TLB:TLB_shootdowns
259.75 ± 17% +131.5% 601.25 ± 9% interrupts.CPU17.TLB:TLB_shootdowns
235.00 ± 19% +186.1% 672.25 ± 10% interrupts.CPU18.TLB:TLB_shootdowns
284.75 ± 35% +103.9% 580.50 ± 18% interrupts.CPU19.TLB:TLB_shootdowns
251.50 ± 24% +157.3% 647.00 ± 27% interrupts.CPU2.TLB:TLB_shootdowns
212.25 ± 31% +189.3% 614.00 ± 8% interrupts.CPU20.TLB:TLB_shootdowns
164.50 ± 37% +259.6% 591.50 ± 23% interrupts.CPU22.TLB:TLB_shootdowns
6456 ± 11% +16.6% 7528 ± 3% interrupts.CPU23.CAL:Function_call_interrupts
174.25 ± 35% +236.0% 585.50 ± 15% interrupts.CPU23.TLB:TLB_shootdowns
182.50 ± 58% +218.2% 580.75 ± 15% interrupts.CPU24.TLB:TLB_shootdowns
193.50 ± 10% +170.4% 523.25 ± 19% interrupts.CPU25.TLB:TLB_shootdowns
115.00 ± 29% +279.3% 436.25 ± 16% interrupts.CPU26.TLB:TLB_shootdowns
3137 ± 6% +135.7% 7395 ± 21% interrupts.CPU27.NMI:Non-maskable_interrupts
3137 ± 6% +135.7% 7395 ± 21% interrupts.CPU27.PMI:Performance_monitoring_interrupts
128.25 ± 32% +353.4% 581.50 ± 23% interrupts.CPU27.TLB:TLB_shootdowns
114.25 ± 21% +308.3% 466.50 ± 18% interrupts.CPU28.TLB:TLB_shootdowns
153.50 ± 38% +226.1% 500.50 ± 22% interrupts.CPU29.TLB:TLB_shootdowns
487.50 ± 11% +38.1% 673.00 ± 7% interrupts.CPU3.RES:Rescheduling_interrupts
269.25 ± 42% +123.5% 601.75 ± 14% interrupts.CPU3.TLB:TLB_shootdowns
179.75 ± 47% +223.9% 582.25 ± 33% interrupts.CPU30.TLB:TLB_shootdowns
246.00 ± 37% +111.0% 519.00 ± 18% interrupts.CPU31.TLB:TLB_shootdowns
6956 +6.2% 7388 ± 3% interrupts.CPU32.CAL:Function_call_interrupts
179.00 ± 28% +165.4% 475.00 ± 21% interrupts.CPU32.TLB:TLB_shootdowns
283.50 ± 12% +93.4% 548.25 ± 28% interrupts.CPU33.TLB:TLB_shootdowns
153.50 ± 17% +210.9% 477.25 ± 20% interrupts.CPU35.TLB:TLB_shootdowns
3439 ± 44% +52.1% 5230 ± 25% interrupts.CPU36.NMI:Non-maskable_interrupts
3439 ± 44% +52.1% 5230 ± 25% interrupts.CPU36.PMI:Performance_monitoring_interrupts
276.75 ± 14% +31.9% 365.00 ± 7% interrupts.CPU36.RES:Rescheduling_interrupts
190.50 ± 36% +151.7% 479.50 ± 24% interrupts.CPU37.TLB:TLB_shootdowns
268.25 ± 6% +33.6% 358.50 ± 12% interrupts.CPU38.RES:Rescheduling_interrupts
153.75 ± 50% +278.4% 581.75 ± 28% interrupts.CPU38.TLB:TLB_shootdowns
155.25 ± 43% +256.0% 552.75 ± 14% interrupts.CPU39.TLB:TLB_shootdowns
527.75 ± 12% +25.7% 663.50 ± 17% interrupts.CPU4.RES:Rescheduling_interrupts
165.50 ± 39% +201.1% 498.25 ± 16% interrupts.CPU40.TLB:TLB_shootdowns
183.00 ± 61% +194.1% 538.25 ± 22% interrupts.CPU41.TLB:TLB_shootdowns
266.75 ± 33% +70.1% 453.75 ± 6% interrupts.CPU42.TLB:TLB_shootdowns
416.25 ± 13% +24.7% 519.25 ± 7% interrupts.CPU43.RES:Rescheduling_interrupts
130.00 ± 20% +386.7% 632.75 ± 18% interrupts.CPU43.TLB:TLB_shootdowns
499.00 ± 12% +45.5% 726.25 ± 24% interrupts.CPU44.RES:Rescheduling_interrupts
284.75 ± 23% +137.3% 675.75 ± 26% interrupts.CPU44.TLB:TLB_shootdowns
448.50 ± 8% +25.5% 562.75 ± 5% interrupts.CPU45.RES:Rescheduling_interrupts
276.00 ± 39% +148.3% 685.25 ± 35% interrupts.CPU45.TLB:TLB_shootdowns
357.00 ± 49% +94.5% 694.25 ± 17% interrupts.CPU46.TLB:TLB_shootdowns
352.00 ± 21% +116.4% 761.75 ± 18% interrupts.CPU47.TLB:TLB_shootdowns
288.50 ± 40% +143.0% 701.00 ± 16% interrupts.CPU48.TLB:TLB_shootdowns
402.50 ± 11% +29.9% 523.00 ± 12% interrupts.CPU49.RES:Rescheduling_interrupts
244.50 ± 20% +130.7% 564.00 ± 13% interrupts.CPU49.TLB:TLB_shootdowns
195.50 ± 43% +210.5% 607.00 ± 15% interrupts.CPU5.TLB:TLB_shootdowns
266.00 ± 33% +173.6% 727.75 ± 11% interrupts.CPU50.TLB:TLB_shootdowns
434.50 ± 9% +20.3% 522.75 ± 9% interrupts.CPU51.RES:Rescheduling_interrupts
241.25 ± 49% +208.1% 743.25 ± 25% interrupts.CPU51.TLB:TLB_shootdowns
230.50 ± 39% +205.4% 704.00 ± 23% interrupts.CPU52.TLB:TLB_shootdowns
228.50 ± 36% +236.8% 769.50 ± 19% interrupts.CPU53.TLB:TLB_shootdowns
245.75 ± 23% +131.6% 569.25 ± 27% interrupts.CPU54.TLB:TLB_shootdowns
294.75 ± 23% +157.8% 760.00 ± 22% interrupts.CPU55.TLB:TLB_shootdowns
265.00 ± 21% +160.3% 689.75 ± 15% interrupts.CPU56.TLB:TLB_shootdowns
399.25 ± 13% +27.8% 510.25 ± 10% interrupts.CPU57.RES:Rescheduling_interrupts
235.25 ± 34% +179.6% 657.75 ± 12% interrupts.CPU57.TLB:TLB_shootdowns
428.00 ± 8% +23.8% 529.75 ± 7% interrupts.CPU58.RES:Rescheduling_interrupts
334.00 ± 31% +79.6% 600.00 ± 16% interrupts.CPU58.TLB:TLB_shootdowns
381.75 ± 12% +37.3% 524.25 ± 10% interrupts.CPU59.RES:Rescheduling_interrupts
258.75 ± 19% +164.6% 684.75 ± 16% interrupts.CPU59.TLB:TLB_shootdowns
262.75 ± 28% +147.7% 650.75 ± 24% interrupts.CPU6.TLB:TLB_shootdowns
261.25 ± 30% +167.1% 697.75 ± 21% interrupts.CPU60.TLB:TLB_shootdowns
243.50 ± 35% +179.7% 681.00 ± 8% interrupts.CPU61.TLB:TLB_shootdowns
421.25 ± 5% +20.9% 509.50 ± 3% interrupts.CPU62.RES:Rescheduling_interrupts
229.00 ± 25% +207.6% 704.50 ± 23% interrupts.CPU62.TLB:TLB_shootdowns
246.50 ± 46% +172.4% 671.50 ± 8% interrupts.CPU63.TLB:TLB_shootdowns
236.00 ± 28% +182.0% 665.50 ± 17% interrupts.CPU64.TLB:TLB_shootdowns
317.25 ± 25% +109.1% 663.50 ± 12% interrupts.CPU65.TLB:TLB_shootdowns
203.50 ± 30% +154.9% 518.75 ± 42% interrupts.CPU66.TLB:TLB_shootdowns
133.50 ± 53% +234.3% 446.25 ± 30% interrupts.CPU68.TLB:TLB_shootdowns
182.25 ± 17% +202.5% 551.25 ± 18% interrupts.CPU69.TLB:TLB_shootdowns
290.50 ± 20% +138.0% 691.25 ± 21% interrupts.CPU7.TLB:TLB_shootdowns
175.25 ± 34% +137.1% 415.50 ± 33% interrupts.CPU70.TLB:TLB_shootdowns
203.50 ± 30% +133.4% 475.00 ± 6% interrupts.CPU71.TLB:TLB_shootdowns
126.25 ± 67% +303.4% 509.25 ± 24% interrupts.CPU72.TLB:TLB_shootdowns
165.00 ± 16% +186.8% 473.25 ± 27% interrupts.CPU73.TLB:TLB_shootdowns
177.00 ± 51% +232.9% 589.25 ± 20% interrupts.CPU74.TLB:TLB_shootdowns
188.75 ± 47% +155.2% 481.75 ± 25% interrupts.CPU75.TLB:TLB_shootdowns
181.75 ± 42% +278.3% 687.50 ± 18% interrupts.CPU76.TLB:TLB_shootdowns
218.50 ± 18% +146.3% 538.25 ± 10% interrupts.CPU77.TLB:TLB_shootdowns
186.50 ± 57% +201.5% 562.25 ± 20% interrupts.CPU78.TLB:TLB_shootdowns
232.00 ± 26% +163.0% 610.25 ± 27% interrupts.CPU8.TLB:TLB_shootdowns
181.75 ± 46% +143.6% 442.75 ± 37% interrupts.CPU80.TLB:TLB_shootdowns
183.75 ± 46% +177.6% 510.00 ± 28% interrupts.CPU81.TLB:TLB_shootdowns
131.75 ± 49% +302.3% 530.00 ± 25% interrupts.CPU82.TLB:TLB_shootdowns
239.50 ± 25% +85.2% 443.50 ± 23% interrupts.CPU83.TLB:TLB_shootdowns
145.75 ± 37% +315.3% 605.25 ± 18% interrupts.CPU84.TLB:TLB_shootdowns
136.50 ± 25% +313.6% 564.50 ± 24% interrupts.CPU85.TLB:TLB_shootdowns
267.00 ± 16% +42.2% 379.75 ± 8% interrupts.CPU86.RES:Rescheduling_interrupts
255.25 ± 23% +82.2% 465.00 ± 33% interrupts.CPU86.TLB:TLB_shootdowns
259.00 ± 15% +46.5% 379.50 ± 3% interrupts.CPU87.RES:Rescheduling_interrupts
189.50 ± 32% +209.2% 586.00 ± 12% interrupts.CPU87.TLB:TLB_shootdowns
443.25 +30.9% 580.00 ± 13% interrupts.CPU9.RES:Rescheduling_interrupts
251.25 ± 21% +108.2% 523.00 ± 26% interrupts.CPU9.TLB:TLB_shootdowns
19144 ± 22% +165.4% 50805 ± 15% interrupts.TLB:TLB_shootdowns
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
Attachments:
  config-4.20.0-rc1-00216-g344e9ffcbd189 (text/plain, 186385 bytes)
  job-script (text/plain, 8158 bytes)
  job.yaml (text/plain, 5804 bytes)
  reproduce (text/plain, 1057 bytes)