Message-ID: <20191104084719.GN29418@shao2-debian>
Date: Mon, 4 Nov 2019 16:47:19 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
Cc: David Howells <dhowells@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, lkp@...ts.01.org
Subject: [pipe] 975832d6ec: hackbench.throughput 15.8% improvement
Greetings,
FYI, we noticed a 15.8% improvement in hackbench.throughput due to commit:
commit: 975832d6ecbe123dfee907af0c77cd7e0c1ad175 ("[PATCH] pipe: wakeup writer only if pipe buffer is at least half empty")
url: https://github.com/0day-ci/linux/commits/Konstantin-Khlebnikov/pipe-wakeup-writer-only-if-pipe-buffer-is-at-least-half-empty/20191030-030850
in testcase: hackbench
on test machine: 4-thread Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
with the following parameters:
nr_threads: 100%
mode: process
ipc: pipe
cpufreq_governor: performance
ucode: 0x21
test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c
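The commit subject describes the policy change: a blocked writer is woken only once the pipe buffer is at least half empty, instead of as soon as any space frees up. The toy model below is a sketch of why such a rule can cut wakeups (and with them context switches). It is not the kernel code: the function name `simulate`, the parameter `wake_threshold`, and the 16-slot buffer are assumptions made for illustration only.

```python
def simulate(slots, messages, wake_threshold):
    """Count writer wakeups in a toy single-writer/single-reader pipe.

    Hypothetical model, not kernel code: the writer fills the buffer until
    it is full and then sleeps; the reader drains one slot per step and
    wakes the sleeping writer once at least `wake_threshold` slots are free.
    """
    used = 0            # slots currently holding data
    written = 0         # messages written so far
    read = 0            # messages read so far
    wakeups = 0         # times the reader had to wake the writer
    writer_asleep = False

    while read < messages:
        # Writer runs until the buffer is full or it has nothing left to send.
        if not writer_asleep:
            while used < slots and written < messages:
                used += 1
                written += 1
            if written < messages:
                writer_asleep = True  # buffer full, more data pending

        # Reader consumes one message.
        used -= 1
        read += 1

        # Wake the writer only when enough space has been freed.
        if writer_asleep and slots - used >= wake_threshold:
            writer_asleep = False
            wakeups += 1

    return wakeups


if __name__ == "__main__":
    print("wake on any free slot:", simulate(16, 1600, wake_threshold=1))
    print("wake when half empty: ", simulate(16, 1600, wake_threshold=8))
```

In this model the half-empty rule lets the writer refill several slots per wakeup instead of one, so the wakeup count drops by roughly the threshold factor. The model ignores scheduling, multiple writers, and partial writes, so it should not be expected to reproduce the measured figures in this report.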
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/ipc/kconfig/mode/nr_threads/rootfs/tbox_group/testcase/ucode:
gcc-7/performance/pipe/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-09-23.cgz/lkp-ivb-d04/hackbench/0x21
commit:
23fdb198ae (" fuse fixes for 5.4-rc6")
975832d6ec ("pipe: wakeup writer only if pipe buffer is at least half empty")
23fdb198ae81f47a 975832d6ecbe123dfee907af0c7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 50% 2:4 dmesg.WARNING:at_ip__fsnotify_parent/0x
%stddev %change %stddev
\ | \
12531 +15.8% 14513 hackbench.throughput
40251249 ± 2% -16.4% 33639762 hackbench.time.involuntary_context_switches
167144 ± 3% +12.7% 188407 hackbench.time.minor_page_faults
1939 ± 2% -5.8% 1825 hackbench.time.system_time
538.17 ± 2% +9.3% 588.24 hackbench.time.user_time
1.992e+08 ± 2% -43.9% 1.117e+08 hackbench.time.voluntary_context_switches
80400000 ± 2% +13.4% 91200000 hackbench.workload
0.58 ± 15% +0.7 1.25 ± 17% mpstat.cpu.all.idle%
21.82 +2.5 24.33 mpstat.cpu.all.usr%
77.00 -3.9% 74.00 vmstat.cpu.sy
21.00 +14.3% 24.00 vmstat.cpu.us
76.50 -17.0% 63.50 vmstat.procs.r
385197 -37.8% 239773 vmstat.system.cs
57048 -13.5% 49351 vmstat.system.in
10639420 ± 2% +60.4% 17068112 proc-vmstat.numa_hit
10639420 ± 2% +60.4% 17068112 proc-vmstat.numa_local
7986332 ± 2% +71.4% 13688588 proc-vmstat.pgalloc_dma32
2707478 ± 12% +26.8% 3432394 ± 6% proc-vmstat.pgalloc_normal
842560 ± 2% +2.4% 863109 proc-vmstat.pgfault
10678880 ± 2% +60.2% 17105823 proc-vmstat.pgfree
5349521 +77.8% 9509156 cpuidle.C1.time
946479 ± 7% +377.5% 4519814 ± 3% cpuidle.C1E.time
18613 ± 5% +307.8% 75904 ± 2% cpuidle.C1E.usage
6190 ± 60% +221.2% 19882 ± 35% cpuidle.C3.usage
3828720 ± 42% +131.5% 8861748 ± 41% cpuidle.C6.time
6799 ± 31% +177.1% 18841 ± 38% cpuidle.C6.usage
1648523 ± 2% +31.3% 2163773 ± 3% cpuidle.POLL.time
346.75 ± 10% -34.9% 225.75 ± 57% interrupts.30:PCI-MSI.512000-edge.ahci[0000:00:1f.2]
7673908 ± 2% -17.5% 6327149 interrupts.CPU0.RES:Rescheduling_interrupts
346.75 ± 10% -34.9% 225.75 ± 57% interrupts.CPU1.30:PCI-MSI.512000-edge.ahci[0000:00:1f.2]
7660843 ± 2% -16.7% 6379521 interrupts.CPU1.RES:Rescheduling_interrupts
7623517 ± 2% -17.0% 6327587 ± 2% interrupts.CPU2.RES:Rescheduling_interrupts
7659300 ± 2% -18.1% 6274802 ± 2% interrupts.CPU3.RES:Rescheduling_interrupts
30617570 ± 2% -17.3% 25309060 interrupts.RES:Rescheduling_interrupts
88.00 ± 15% +79.5% 158.00 ± 5% interrupts.TLB:TLB_shootdowns
0.21 ± 2% +0.2 0.39 ± 2% turbostat.C1%
18597 ± 5% +308.1% 75901 ± 2% turbostat.C1E
0.04 +0.1 0.18 ± 2% turbostat.C1E%
6181 ± 60% +221.6% 19881 ± 35% turbostat.C3
6773 ± 31% +178.0% 18831 ± 38% turbostat.C6
0.15 ± 42% +0.2 0.36 ± 41% turbostat.C6%
0.33 +119.7% 0.72 turbostat.CPU%c1
15.76 -0.9% 15.62 turbostat.CorWatt
35950496 ± 2% -15.0% 30544313 turbostat.IRQ
23150 ± 9% -62.1% 8785 ± 57% sched_debug.cfs_rq:/.spread0.max
130.98 ± 5% -10.9% 116.76 ± 4% sched_debug.cfs_rq:/.util_avg.stddev
100398 ± 4% +14.7% 115186 ± 5% sched_debug.cpu.avg_idle.avg
151034 ± 11% +19.3% 180111 ± 4% sched_debug.cpu.avg_idle.max
3.32 ± 12% +167.7% 8.89 ± 31% sched_debug.cpu.clock.stddev
3.32 ± 12% +167.8% 8.89 ± 31% sched_debug.cpu.clock_task.stddev
28545103 ± 4% -37.1% 17951097 sched_debug.cpu.nr_switches.avg
28918032 ± 3% -37.0% 18220304 sched_debug.cpu.nr_switches.max
28172788 ± 4% -37.1% 17713695 sched_debug.cpu.nr_switches.min
459.43 ± 44% +52.6% 700.91 ± 5% sched_debug.cpu.nr_uninterruptible.max
-496.04 +56.5% -776.18 sched_debug.cpu.nr_uninterruptible.min
362.64 ± 33% +65.2% 599.21 ± 15% sched_debug.cpu.nr_uninterruptible.stddev
7.44 -22.3% 5.78 ± 2% perf-stat.i.MPKI
1.2e+09 -1.9% 1.177e+09 perf-stat.i.branch-instructions
1.69 -0.1 1.59 perf-stat.i.branch-miss-rate%
20294379 -7.9% 18684653 perf-stat.i.branch-misses
3.51 ± 2% +1.7 5.26 ± 3% perf-stat.i.cache-miss-rate%
1541582 ± 2% +7.3% 1653979 ± 3% perf-stat.i.cache-misses
45521049 -24.9% 34206803 perf-stat.i.cache-references
386071 -37.7% 240515 perf-stat.i.context-switches
3250 -30.3% 2267 ± 2% perf-stat.i.cpu-migrations
8769 ± 3% -5.9% 8256 perf-stat.i.cycles-between-cache-misses
0.58 +0.0 0.61 perf-stat.i.dTLB-store-miss-rate%
7193442 +8.1% 7774733 perf-stat.i.dTLB-store-misses
94.20 -1.0 93.23 perf-stat.i.iTLB-load-miss-rate%
5379474 -25.0% 4037172 perf-stat.i.iTLB-load-misses
333420 -11.1% 296309 ± 5% perf-stat.i.iTLB-loads
6.142e+09 -1.3% 6.059e+09 perf-stat.i.instructions
1167 +36.3% 1591 perf-stat.i.instructions-per-iTLB-miss
1299 +4.5% 1357 perf-stat.i.minor-faults
1299 +4.5% 1357 perf-stat.i.page-faults
7.41 -23.8% 5.65 perf-stat.overall.MPKI
1.69 -0.1 1.59 perf-stat.overall.branch-miss-rate%
3.39 ± 2% +1.4 4.84 ± 3% perf-stat.overall.cache-miss-rate%
8464 ± 2% -7.3% 7845 ± 3% perf-stat.overall.cycles-between-cache-misses
0.58 +0.0 0.62 perf-stat.overall.dTLB-store-miss-rate%
94.16 -1.0 93.16 perf-stat.overall.iTLB-load-miss-rate%
1141 +31.5% 1500 perf-stat.overall.instructions-per-iTLB-miss
47974 -14.5% 41000 perf-stat.overall.path-length
1.198e+09 -1.9% 1.176e+09 perf-stat.ps.branch-instructions
20275756 -7.9% 18672032 perf-stat.ps.branch-misses
1540155 ± 2% +7.3% 1652879 ± 3% perf-stat.ps.cache-misses
45479055 -24.8% 34182981 perf-stat.ps.cache-references
385715 -37.7% 240346 perf-stat.ps.context-switches
3247 -30.2% 2265 ± 2% perf-stat.ps.cpu-migrations
7186793 +8.1% 7769429 perf-stat.ps.dTLB-store-misses
1.239e+09 +1.1% 1.252e+09 perf-stat.ps.dTLB-stores
5374507 -24.9% 4034360 perf-stat.ps.iTLB-load-misses
333114 -11.1% 296095 ± 5% perf-stat.ps.iTLB-loads
6.136e+09 -1.3% 6.055e+09 perf-stat.ps.instructions
1298 +4.5% 1356 perf-stat.ps.minor-faults
1298 +4.5% 1356 perf-stat.ps.page-faults
10.51 ± 5% -4.7 5.82 ± 13% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write
10.15 ± 5% -4.6 5.52 ± 14% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write
9.91 ± 5% -4.5 5.41 ± 14% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
25.84 ± 2% -4.2 21.59 perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
24.75 ± 2% -4.1 20.67 perf-profile.calltrace.cycles-pp.pipe_write.new_sync_write.vfs_write.ksys_write.do_syscall_64
12.47 ± 5% -3.7 8.72 ± 7% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write.ksys_write
9.65 ± 4% -3.6 6.04 ± 16% perf-profile.calltrace.cycles-pp.pipe_wait.pipe_read.new_sync_read.vfs_read.ksys_read
20.64 -3.2 17.45 ± 5% perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.64 -3.2 16.45 ± 5% perf-profile.calltrace.cycles-pp.pipe_read.new_sync_read.vfs_read.ksys_read.do_syscall_64
8.63 ± 3% -3.2 5.44 ± 15% perf-profile.calltrace.cycles-pp.schedule.pipe_wait.pipe_read.new_sync_read.vfs_read
8.49 ± 3% -3.1 5.36 ± 15% perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_wait.pipe_read.new_sync_read
3.36 ± 11% -2.8 0.57 ± 59% perf-profile.calltrace.cycles-pp.pipe_wait.pipe_write.new_sync_write.vfs_write.ksys_write
2.37 ± 10% -1.9 0.45 ± 59% perf-profile.calltrace.cycles-pp.schedule.pipe_wait.pipe_write.new_sync_write.vfs_write
2.32 ± 10% -1.9 0.44 ± 59% perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_wait.pipe_write.new_sync_write
3.86 ± 5% -1.7 2.19 ± 11% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
3.82 ± 5% -1.7 2.16 ± 12% perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
3.69 ± 6% -1.6 2.09 ± 11% perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
2.86 ± 5% -1.1 1.79 ± 16% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_wait.pipe_read
1.95 ± 3% -1.0 0.91 ± 25% perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
2.27 ± 5% -1.0 1.24 ± 14% perf-profile.calltrace.cycles-pp.select_task_rq_fair.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
1.79 ± 4% -1.0 0.82 ± 27% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
1.86 ± 6% -0.8 1.05 ± 14% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
1.76 ± 5% -0.8 0.97 ± 17% perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.try_to_wake_up.autoremove_wake_function.__wake_up_common
1.20 ± 22% -0.5 0.67 ± 12% perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.16 ± 22% -0.5 0.66 ± 12% perf-profile.calltrace.cycles-pp.schedule.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.57 ± 15% -0.5 1.07 ± 15% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.pipe_wait.pipe_read
1.13 ± 22% -0.5 0.64 ± 10% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.27 ± 11% -0.4 0.88 ± 11% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_wait.pipe_read
1.13 ± 13% -0.4 0.78 ± 16% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_wait
2.46 ± 2% +0.2 2.63 ± 2% perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.98 ± 16% +0.2 1.17 ± 3% perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.new_sync_write.vfs_write.ksys_write
0.43 ± 58% +0.2 0.67 ± 6% perf-profile.calltrace.cycles-pp.current_time.file_update_time.pipe_write.new_sync_write.vfs_write
0.70 ± 15% +0.3 0.97 ± 7% perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.vfs_read.ksys_read
0.94 ± 17% +0.3 1.22 ± 8% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter.pipe_write.new_sync_write
0.86 ± 10% +0.3 1.14 ± 9% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.98 ± 14% +0.3 1.27 ± 8% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.new_sync_read.vfs_read.ksys_read
0.99 ± 18% +0.3 1.29 ± 7% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter.pipe_write.new_sync_write.vfs_write
0.47 ± 58% +0.3 0.77 ± 6% perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.vfs_write.ksys_write
0.93 ± 10% +0.3 1.26 ± 11% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.97 ± 13% +0.3 1.31 ± 4% perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_write.ksys_write.do_syscall_64
1.33 ± 14% +0.3 1.67 ± 3% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.new_sync_write.vfs_write.ksys_write
0.28 ±100% +0.4 0.65 ± 7% perf-profile.calltrace.cycles-pp.__might_fault.copy_page_to_iter.pipe_read.new_sync_read.vfs_read
3.42 ± 5% +0.4 3.84 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.new_sync_read.vfs_read.ksys_read
1.03 ± 21% +0.4 1.46 ± 9% perf-profile.calltrace.cycles-pp.fsnotify.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.10 ± 15% +0.4 1.53 ± 5% perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.46 ± 57% +0.5 0.92 ± 5% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.new_sync_write.vfs_write.ksys_write
2.24 ± 3% +0.5 2.71 ± 6% perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
1.00 ± 14% +0.5 1.49 ± 4% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.08 ± 14% +0.6 1.63 ± 4% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.14 ±173% +0.6 0.71 ± 7% perf-profile.calltrace.cycles-pp.fsnotify.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.80 ± 3% +0.6 3.42 ± 5% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.new_sync_write.vfs_write.ksys_write
0.16 ±173% +0.7 0.81 ± 17% perf-profile.calltrace.cycles-pp.secondary_startup_64
0.00 +0.7 0.68 ± 5% perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.new_sync_write.vfs_write.ksys_write
4.86 ± 6% +0.7 5.58 perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.86 +0.9 4.72 ± 2% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.60 ± 10% +0.9 1.47 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__wake_up_common_lock.pipe_write.new_sync_write
1.00 ± 9% +1.0 2.04 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write
6.33 ± 16% +1.9 8.20 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
7.46 ± 14% +2.4 9.85 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
13.21 ± 4% -6.5 6.75 ± 12% perf-profile.children.cycles-pp.pipe_wait
13.30 ± 3% -6.0 7.31 ± 12% perf-profile.children.cycles-pp.__schedule
12.65 ± 3% -5.8 6.83 ± 12% perf-profile.children.cycles-pp.schedule
11.08 ± 5% -5.0 6.05 ± 12% perf-profile.children.cycles-pp.__wake_up_common
10.69 ± 5% -5.0 5.72 ± 13% perf-profile.children.cycles-pp.autoremove_wake_function
10.58 ± 6% -4.8 5.76 ± 13% perf-profile.children.cycles-pp.try_to_wake_up
25.90 ± 2% -4.3 21.65 perf-profile.children.cycles-pp.new_sync_write
24.88 ± 2% -4.1 20.81 perf-profile.children.cycles-pp.pipe_write
13.18 ± 4% -4.0 9.14 ± 7% perf-profile.children.cycles-pp.__wake_up_common_lock
20.69 -3.2 17.49 ± 5% perf-profile.children.cycles-pp.new_sync_read
19.73 -3.2 16.56 ± 5% perf-profile.children.cycles-pp.pipe_read
31.87 ± 2% -2.9 28.96 perf-profile.children.cycles-pp.vfs_write
33.43 ± 2% -2.4 31.00 perf-profile.children.cycles-pp.ksys_write
79.28 -2.2 77.04 perf-profile.children.cycles-pp.do_syscall_64
80.23 -2.1 78.11 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
28.11 -2.1 26.02 ± 3% perf-profile.children.cycles-pp.vfs_read
4.11 ± 6% -1.8 2.29 ± 10% perf-profile.children.cycles-pp.ttwu_do_activate
4.08 ± 6% -1.8 2.27 ± 10% perf-profile.children.cycles-pp.activate_task
29.63 -1.8 27.83 ± 2% perf-profile.children.cycles-pp.ksys_read
3.95 ± 6% -1.8 2.19 ± 10% perf-profile.children.cycles-pp.enqueue_task_fair
3.62 ± 5% -1.6 2.00 ± 12% perf-profile.children.cycles-pp.dequeue_task_fair
2.70 ± 3% -1.4 1.28 ± 22% perf-profile.children.cycles-pp._raw_spin_lock
2.76 ± 5% -1.2 1.52 ± 13% perf-profile.children.cycles-pp.switch_mm_irqs_off
2.41 ± 6% -1.1 1.31 ± 12% perf-profile.children.cycles-pp.select_task_rq_fair
2.56 ± 4% -1.1 1.48 ± 7% perf-profile.children.cycles-pp.pick_next_task_fair
3.58 ± 2% -1.0 2.56 ± 7% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
2.01 ± 7% -0.9 1.11 ± 14% perf-profile.children.cycles-pp.enqueue_entity
1.94 ± 2% -0.9 1.08 ± 14% perf-profile.children.cycles-pp.update_curr
1.90 ± 6% -0.9 1.05 ± 13% perf-profile.children.cycles-pp.select_idle_sibling
0.96 ± 6% -0.7 0.23 ± 12% perf-profile.children.cycles-pp.prepare_to_wait
1.54 ± 10% -0.7 0.81 ± 12% perf-profile.children.cycles-pp.exit_to_usermode_loop
1.58 ± 5% -0.7 0.87 ± 11% perf-profile.children.cycles-pp.dequeue_entity
1.83 ± 8% -0.7 1.12 ± 9% perf-profile.children.cycles-pp.update_load_avg
1.49 ± 4% -0.6 0.86 ± 13% perf-profile.children.cycles-pp.reweight_entity
1.42 ± 7% -0.6 0.80 ± 15% perf-profile.children.cycles-pp.load_new_mm_cr3
0.81 ± 5% -0.4 0.44 ± 13% perf-profile.children.cycles-pp.update_cfs_group
0.90 ± 6% -0.4 0.53 ± 12% perf-profile.children.cycles-pp.set_next_entity
0.76 ± 7% -0.4 0.40 ± 14% perf-profile.children.cycles-pp.__switch_to
0.72 ± 10% -0.4 0.36 ± 8% perf-profile.children.cycles-pp.update_rq_clock
0.88 ± 7% -0.3 0.56 ± 13% perf-profile.children.cycles-pp.__update_load_avg_se
0.72 ± 8% -0.3 0.40 ± 8% perf-profile.children.cycles-pp.__switch_to_asm
0.71 ± 11% -0.3 0.41 ± 12% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.63 ± 2% -0.3 0.35 ± 13% perf-profile.children.cycles-pp.___perf_sw_event
0.63 ± 11% -0.3 0.36 ± 11% perf-profile.children.cycles-pp.check_preempt_curr
0.67 ± 10% -0.3 0.41 ± 11% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.57 ± 6% -0.3 0.31 ± 15% perf-profile.children.cycles-pp.__enqueue_entity
0.57 ± 10% -0.2 0.34 ± 16% perf-profile.children.cycles-pp.native_write_msr
0.52 ± 7% -0.2 0.29 ± 12% perf-profile.children.cycles-pp.pick_next_entity
0.50 ± 12% -0.2 0.28 ± 11% perf-profile.children.cycles-pp.check_preempt_wakeup
0.51 ± 7% -0.2 0.29 ± 16% perf-profile.children.cycles-pp.sched_clock_cpu
0.57 ± 6% -0.2 0.36 ± 17% perf-profile.children.cycles-pp.switch_fpu_return
0.40 ± 12% -0.2 0.20 ± 18% perf-profile.children.cycles-pp.available_idle_cpu
0.42 ± 10% -0.2 0.23 ± 7% perf-profile.children.cycles-pp.account_entity_dequeue
0.43 ± 6% -0.2 0.24 ± 18% perf-profile.children.cycles-pp.sched_clock
0.44 ± 23% -0.2 0.26 ± 42% perf-profile.children.cycles-pp.finish_task_switch
0.61 ± 7% -0.2 0.43 ± 6% perf-profile.children.cycles-pp.preempt_schedule_common
0.43 ± 10% -0.2 0.26 ± 6% perf-profile.children.cycles-pp.__calc_delta
0.39 ± 5% -0.2 0.22 ± 19% perf-profile.children.cycles-pp.native_sched_clock
0.92 ± 12% -0.2 0.75 ± 5% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
1.69 ± 3% -0.2 1.53 ± 4% perf-profile.children.cycles-pp._cond_resched
0.36 ± 13% -0.2 0.20 ± 7% perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.32 ± 14% -0.1 0.17 ± 13% perf-profile.children.cycles-pp.prepare_exit_to_usermode
0.33 ± 3% -0.1 0.19 ± 9% perf-profile.children.cycles-pp.cpumask_next_wrap
0.25 ± 8% -0.1 0.14 ± 16% perf-profile.children.cycles-pp.rb_erase
0.33 ± 6% -0.1 0.21 ± 5% perf-profile.children.cycles-pp.__list_del_entry_valid
0.26 ± 7% -0.1 0.14 ± 32% perf-profile.children.cycles-pp.cpuacct_charge
0.26 ± 15% -0.1 0.15 ± 18% perf-profile.children.cycles-pp.account_entity_enqueue
0.21 ± 6% -0.1 0.10 ± 15% perf-profile.children.cycles-pp.clear_buddies
0.18 ± 29% -0.1 0.07 ± 29% perf-profile.children.cycles-pp.set_next_buddy
0.32 ± 10% -0.1 0.22 ± 9% perf-profile.children.cycles-pp.put_prev_entity
0.34 ± 7% -0.1 0.24 ± 12% perf-profile.children.cycles-pp.reschedule_interrupt
0.26 ± 15% -0.1 0.16 ± 9% perf-profile.children.cycles-pp.find_next_bit
0.25 ± 10% -0.1 0.15 ± 10% perf-profile.children.cycles-pp.update_min_vruntime
0.15 ± 8% -0.1 0.06 ± 17% perf-profile.children.cycles-pp.__list_add_valid
0.17 ± 11% -0.1 0.09 ± 13% perf-profile.children.cycles-pp.rb_insert_color
0.32 ± 4% -0.1 0.25 perf-profile.children.cycles-pp.anon_pipe_buf_release
0.15 ± 11% -0.1 0.08 ± 6% perf-profile.children.cycles-pp.deactivate_task
0.09 ± 13% -0.1 0.03 ±100% perf-profile.children.cycles-pp.generic_update_time
0.14 ± 5% -0.1 0.09 ± 14% perf-profile.children.cycles-pp.finish_wait
0.15 ± 18% -0.1 0.10 ± 18% perf-profile.children.cycles-pp.cpumask_next
0.12 ± 9% -0.0 0.08 ± 27% perf-profile.children.cycles-pp.native_load_tls
0.23 ± 8% +0.1 0.29 ± 4% perf-profile.children.cycles-pp.__sb_end_write
0.37 ± 4% +0.1 0.45 ± 6% perf-profile.children.cycles-pp.__x64_sys_write
0.00 +0.1 0.09 ± 5% perf-profile.children.cycles-pp.osq_lock
0.35 ± 6% +0.1 0.45 ± 11% perf-profile.children.cycles-pp.generic_pipe_buf_confirm
0.44 ± 23% +0.1 0.56 ± 3% perf-profile.children.cycles-pp.timestamp_truncate
0.11 ± 33% +0.1 0.23 ± 9% perf-profile.children.cycles-pp.get_page_from_freelist
0.15 ± 23% +0.1 0.27 ± 10% perf-profile.children.cycles-pp.mutex_spin_on_owner
0.34 ± 12% +0.1 0.48 ± 26% perf-profile.children.cycles-pp.inode_has_perm
0.14 ± 25% +0.2 0.29 ± 10% perf-profile.children.cycles-pp.__alloc_pages_nodemask
0.10 ± 43% +0.2 0.27 ± 25% perf-profile.children.cycles-pp.start_kernel
1.76 ± 3% +0.2 1.94 ± 2% perf-profile.children.cycles-pp.___might_sleep
0.10 ± 27% +0.2 0.28 ± 13% perf-profile.children.cycles-pp.__vfs_write
1.24 ± 6% +0.2 1.45 ± 6% perf-profile.children.cycles-pp.__fsnotify_parent
1.09 ± 7% +0.2 1.31 ± 7% perf-profile.children.cycles-pp.copyin
2.38 ± 5% +0.2 2.60 ± 5% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
2.27 ± 4% +0.2 2.49 ± 4% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.99 ± 2% +0.2 1.24 ± 4% perf-profile.children.cycles-pp.__might_fault
1.17 ± 9% +0.3 1.44 ± 4% perf-profile.children.cycles-pp.__might_sleep
2.90 ± 2% +0.3 3.20 ± 5% perf-profile.children.cycles-pp.mutex_lock
0.50 ± 20% +0.3 0.81 ± 17% perf-profile.children.cycles-pp.secondary_startup_64
0.50 ± 20% +0.3 0.81 ± 17% perf-profile.children.cycles-pp.cpu_startup_entry
0.49 ± 21% +0.3 0.81 ± 17% perf-profile.children.cycles-pp.do_idle
0.24 ± 21% +0.3 0.57 ± 20% perf-profile.children.cycles-pp.intel_idle
0.46 ± 6% +0.3 0.80 ± 5% perf-profile.children.cycles-pp.__mutex_lock
1.10 ± 2% +0.3 1.44 ± 3% perf-profile.children.cycles-pp.mutex_unlock
1.42 ± 4% +0.3 1.77 ± 5% perf-profile.children.cycles-pp.avc_has_perm
0.33 ± 20% +0.3 0.68 ± 18% perf-profile.children.cycles-pp.cpuidle_enter_state
0.33 ± 20% +0.3 0.68 ± 18% perf-profile.children.cycles-pp.cpuidle_enter
3.52 ± 4% +0.4 3.92 perf-profile.children.cycles-pp.copy_page_to_iter
1.65 ± 13% +0.5 2.20 ± 9% perf-profile.children.cycles-pp.fsnotify
4.79 ± 2% +0.6 5.35 ± 4% perf-profile.children.cycles-pp.selinux_file_permission
2.35 ± 5% +0.6 2.92 ± 3% perf-profile.children.cycles-pp.file_has_perm
2.91 ± 3% +0.6 3.49 ± 5% perf-profile.children.cycles-pp.copy_page_from_iter
2.12 ± 6% +0.6 2.73 ± 6% perf-profile.children.cycles-pp.__fget_light
2.29 ± 6% +0.7 2.98 ± 7% perf-profile.children.cycles-pp.__fdget_pos
7.84 +1.3 9.17 perf-profile.children.cycles-pp.syscall_return_via_sysret
8.92 +1.5 10.38 ± 2% perf-profile.children.cycles-pp.security_file_permission
8.34 ± 2% +1.5 9.87 perf-profile.children.cycles-pp.entry_SYSCALL_64
3.57 ± 2% -1.0 2.56 ± 7% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.75 ± 2% -0.8 0.98 ± 12% perf-profile.self.cycles-pp.__schedule
1.34 ± 5% -0.6 0.71 ± 14% perf-profile.self.cycles-pp.switch_mm_irqs_off
1.42 ± 7% -0.6 0.80 ± 15% perf-profile.self.cycles-pp.load_new_mm_cr3
1.03 ± 6% -0.5 0.55 ± 16% perf-profile.self.cycles-pp.update_curr
0.79 ± 5% -0.4 0.43 ± 14% perf-profile.self.cycles-pp.update_cfs_group
0.78 ± 3% -0.4 0.43 ± 11% perf-profile.self.cycles-pp.select_idle_sibling
0.72 ± 7% -0.3 0.38 ± 11% perf-profile.self.cycles-pp.__switch_to
0.85 ± 7% -0.3 0.54 ± 13% perf-profile.self.cycles-pp.__update_load_avg_se
0.70 ± 8% -0.3 0.38 ± 6% perf-profile.self.cycles-pp.__switch_to_asm
0.62 ± 2% -0.3 0.34 ± 18% perf-profile.self.cycles-pp.reweight_entity
0.57 ± 2% -0.3 0.30 ± 16% perf-profile.self.cycles-pp.___perf_sw_event
0.66 ± 9% -0.3 0.40 ± 10% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.56 ± 7% -0.3 0.31 ± 15% perf-profile.self.cycles-pp.__enqueue_entity
0.56 ± 5% -0.2 0.32 ± 5% perf-profile.self.cycles-pp.update_load_avg
0.57 ± 6% -0.2 0.33 ± 8% perf-profile.self.cycles-pp.pick_next_task_fair
0.49 ± 8% -0.2 0.25 ± 11% perf-profile.self.cycles-pp.select_task_rq_fair
0.57 ± 10% -0.2 0.33 ± 14% perf-profile.self.cycles-pp.native_write_msr
0.40 ± 16% -0.2 0.18 ± 21% perf-profile.self.cycles-pp.pipe_wait
0.41 ± 15% -0.2 0.20 ± 7% perf-profile.self.cycles-pp.update_rq_clock
0.40 ± 13% -0.2 0.20 ± 21% perf-profile.self.cycles-pp.available_idle_cpu
0.55 ± 7% -0.2 0.35 ± 18% perf-profile.self.cycles-pp.switch_fpu_return
0.41 ± 3% -0.2 0.22 ± 8% perf-profile.self.cycles-pp.enqueue_task_fair
0.38 ± 5% -0.2 0.21 ± 18% perf-profile.self.cycles-pp.native_sched_clock
0.43 ± 10% -0.2 0.26 ± 6% perf-profile.self.cycles-pp.__calc_delta
0.42 ± 9% -0.2 0.27 ± 11% perf-profile.self.cycles-pp.dequeue_task_fair
0.43 ± 8% -0.1 0.28 ± 14% perf-profile.self.cycles-pp.try_to_wake_up
0.81 ± 6% -0.1 0.68 ± 5% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.30 ± 15% -0.1 0.17 ± 4% perf-profile.self.cycles-pp.account_entity_dequeue
0.33 ± 6% -0.1 0.21 ± 5% perf-profile.self.cycles-pp.__list_del_entry_valid
0.28 ± 23% -0.1 0.16 ± 14% perf-profile.self.cycles-pp.enqueue_entity
0.37 ± 9% -0.1 0.25 ± 9% perf-profile.self.cycles-pp._raw_spin_lock
0.20 ± 9% -0.1 0.09 ± 24% perf-profile.self.cycles-pp.prepare_to_wait
0.24 ± 18% -0.1 0.12 ± 17% perf-profile.self.cycles-pp.check_preempt_wakeup
0.25 ± 5% -0.1 0.14 ± 15% perf-profile.self.cycles-pp.pick_next_entity
0.25 ± 7% -0.1 0.14 ± 32% perf-profile.self.cycles-pp.cpuacct_charge
0.24 ± 9% -0.1 0.13 ± 14% perf-profile.self.cycles-pp.rb_erase
0.17 ± 33% -0.1 0.07 ± 28% perf-profile.self.cycles-pp.set_next_buddy
0.18 ± 12% -0.1 0.08 ± 17% perf-profile.self.cycles-pp.dequeue_entity
0.19 ± 9% -0.1 0.10 ± 17% perf-profile.self.cycles-pp.clear_buddies
0.18 ± 2% -0.1 0.08 ± 10% perf-profile.self.cycles-pp.schedule
0.24 ± 15% -0.1 0.15 ± 8% perf-profile.self.cycles-pp.find_next_bit
0.22 ± 11% -0.1 0.13 ± 19% perf-profile.self.cycles-pp.account_entity_enqueue
0.17 ± 4% -0.1 0.09 ± 19% perf-profile.self.cycles-pp.set_next_entity
0.17 ± 10% -0.1 0.08 ± 15% perf-profile.self.cycles-pp.rb_insert_color
0.23 ± 11% -0.1 0.15 ± 10% perf-profile.self.cycles-pp.update_min_vruntime
0.13 ± 13% -0.1 0.05 ± 58% perf-profile.self.cycles-pp.autoremove_wake_function
0.19 ± 3% -0.1 0.11 ± 13% perf-profile.self.cycles-pp.cpumask_next_wrap
0.14 ± 10% -0.1 0.06 ± 20% perf-profile.self.cycles-pp.__list_add_valid
0.22 ± 13% -0.1 0.14 ± 6% perf-profile.self.cycles-pp.finish_task_switch
0.31 ± 5% -0.1 0.24 ± 2% perf-profile.self.cycles-pp.anon_pipe_buf_release
0.14 ± 13% -0.1 0.07 ± 10% perf-profile.self.cycles-pp.deactivate_task
0.39 ± 11% -0.1 0.33 ± 4% perf-profile.self.cycles-pp.__wake_up_common
0.98 ± 2% -0.1 0.93 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.70 ± 2% -0.1 0.65 ± 5% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.12 ± 9% -0.1 0.06 ± 59% perf-profile.self.cycles-pp.activate_task
0.13 ± 16% -0.0 0.08 ± 15% perf-profile.self.cycles-pp.check_preempt_curr
0.12 ± 9% -0.0 0.08 ± 24% perf-profile.self.cycles-pp.native_load_tls
0.11 ± 11% -0.0 0.08 ± 19% perf-profile.self.cycles-pp.finish_wait
0.13 ± 12% -0.0 0.10 ± 14% perf-profile.self.cycles-pp.rb_next
0.22 ± 7% +0.1 0.28 ± 3% perf-profile.self.cycles-pp.__sb_end_write
0.32 ± 2% +0.1 0.39 ± 6% perf-profile.self.cycles-pp.__x64_sys_write
0.33 +0.1 0.41 ± 7% perf-profile.self.cycles-pp.__wake_up_common_lock
0.00 +0.1 0.08 ± 5% perf-profile.self.cycles-pp.osq_lock
0.04 ±107% +0.1 0.12 ± 12% perf-profile.self.cycles-pp.get_page_from_freelist
0.41 ± 23% +0.1 0.51 ± 2% perf-profile.self.cycles-pp.timestamp_truncate
0.31 ± 8% +0.1 0.41 ± 12% perf-profile.self.cycles-pp.generic_pipe_buf_confirm
0.15 ± 21% +0.1 0.27 ± 9% perf-profile.self.cycles-pp.mutex_spin_on_owner
0.15 ± 13% +0.1 0.29 perf-profile.self.cycles-pp.__mutex_lock
0.97 ± 4% +0.2 1.12 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.88 ± 9% +0.2 1.03 ± 4% perf-profile.self.cycles-pp.copy_page_from_iter
1.72 ± 3% +0.2 1.89 ± 2% perf-profile.self.cycles-pp.___might_sleep
1.02 ± 2% +0.2 1.19 ± 2% perf-profile.self.cycles-pp.copy_page_to_iter
0.09 ± 28% +0.2 0.27 ± 13% perf-profile.self.cycles-pp.__vfs_write
1.15 ± 5% +0.2 1.39 ± 6% perf-profile.self.cycles-pp.__fsnotify_parent
1.34 ± 2% +0.2 1.58 ± 4% perf-profile.self.cycles-pp.security_file_permission
1.07 ± 8% +0.3 1.32 ± 4% perf-profile.self.cycles-pp.__might_sleep
1.46 ± 2% +0.3 1.73 ± 8% perf-profile.self.cycles-pp.mutex_lock
2.29 ± 4% +0.3 2.56 ± 5% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.24 ± 21% +0.3 0.57 ± 20% perf-profile.self.cycles-pp.intel_idle
1.06 ± 2% +0.3 1.40 ± 3% perf-profile.self.cycles-pp.mutex_unlock
1.39 ± 3% +0.4 1.75 ± 4% perf-profile.self.cycles-pp.avc_has_perm
1.61 ± 12% +0.5 2.12 ± 6% perf-profile.self.cycles-pp.fsnotify
1.57 ± 2% +0.5 2.10 ± 8% perf-profile.self.cycles-pp.pipe_write
2.05 ± 7% +0.6 2.65 ± 6% perf-profile.self.cycles-pp.__fget_light
3.05 ± 5% +0.7 3.75 ± 6% perf-profile.self.cycles-pp.selinux_file_permission
7.82 +1.3 9.15 perf-profile.self.cycles-pp.syscall_return_via_sysret
8.26 ± 2% +1.6 9.87 perf-profile.self.cycles-pp.entry_SYSCALL_64
12.83 ± 3% +2.6 15.45 ± 2% perf-profile.self.cycles-pp.do_syscall_64
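The middle column of the table above can be checked by hand. The sketch below assumes that a value shown with a percent sign is the relative change of the patched kernel against the base, and that a value shown without one (used for metrics that are already percentages) is the plain difference in percentage points. The helper names are hypothetical, and the robot may compute from unrounded per-run averages, so small rounding differences are possible.

```python
def pct_change(base, patched):
    """Relative change of `patched` against `base`, in percent."""
    return (patched - base) / base * 100.0


def point_change(base, patched):
    """Absolute difference, for metrics that are already percentages."""
    return patched - base


# (base, patched) pairs copied from the table above.
rows = {
    "hackbench.throughput": (12531, 14513),
    "hackbench.time.voluntary_context_switches": (1.992e8, 1.117e8),
    "vmstat.system.cs": (385197, 239773),
}

if __name__ == "__main__":
    for name, (base, patched) in rows.items():
        print(f"{pct_change(base, patched):+6.1f}%  {name}")
    # mpstat.cpu.all.usr% is itself a percentage, so the table reports points.
    print(f"{point_change(21.82, 24.33):+6.1f}   mpstat.cpu.all.usr%")
```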
[ASCII plot: hackbench.throughput per sample. The bisect-bad (O) samples lie between roughly 14000 and 15000; the remaining samples lie around 12500.]
[ASCII plot: hackbench.time.voluntary_context_switches per sample. The bisect-bad (O) samples lie between roughly 1.1e+08 and 1.2e+08; the remaining samples lie between roughly 1.9e+08 and 2e+08.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-rc5-00035-g975832d6ecbe1" of type "text/plain" (200562 bytes)
View attachment "job-script" of type "text/plain" (7509 bytes)
View attachment "job.yaml" of type "text/plain" (5084 bytes)
View attachment "reproduce" of type "text/plain" (1698 bytes)