Message-ID: <20191127003907.GA20422@shao2-debian>
Date: Wed, 27 Nov 2019 08:39:07 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...ts.01.org
Subject: [sched/core] 5d7d605642: will-it-scale.per_thread_ops 2.0% improvement
Greetings,
FYI, we noticed a 2.0% improvement of will-it-scale.per_thread_ops due to commit:
commit: 5d7d605642b28a5911198a405a6072f091bfbee6 ("sched/core: Optimize pick_next_task()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with following parameters:
nr_task: 16
mode: thread
test: sched_yield
cpufreq_governor: performance
ucode: 0xb000038
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/thread/16/debian-x86_64-2019-11-14.cgz/lkp-bdw-ep6/sched_yield/will-it-scale/0xb000038
commit:
f488e1057b ("sched/core: Make pick_next_task_idle() more consistent")
5d7d605642 ("sched/core: Optimize pick_next_task()")
f488e1057bb97b88 5d7d605642b28a5911198a405a6
---------------- ---------------------------
       fail:runs  %reproduction  fail:runs
           |            |            |
          :4           25%          1:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change       %stddev
             \          |              \
1221618 +2.0% 1246050 will-it-scale.per_thread_ops
1649 -4.3% 1579 will-it-scale.time.system_time
3166 +2.2% 3237 will-it-scale.time.user_time
19545896 +2.0% 19936814 will-it-scale.workload
11.00 +9.1% 12.00 vmstat.cpu.us
109124 ± 20% -48.9% 55733 ± 53% numa-meminfo.node0.Active
109082 ± 20% -48.9% 55686 ± 53% numa-meminfo.node0.Active(anon)
14371 ± 31% -44.7% 7952 ± 44% numa-meminfo.node0.Shmem
159384 ± 14% +33.6% 212862 ± 13% numa-meminfo.node1.Active
159262 ± 14% +33.6% 212744 ± 13% numa-meminfo.node1.Active(anon)
27266 ± 20% -48.9% 13928 ± 53% numa-vmstat.node0.nr_active_anon
3594 ± 31% -44.7% 1987 ± 44% numa-vmstat.node0.nr_shmem
27266 ± 20% -48.9% 13928 ± 53% numa-vmstat.node0.nr_zone_active_anon
39815 ± 14% +33.5% 53173 ± 13% numa-vmstat.node1.nr_active_anon
39815 ± 14% +33.5% 53173 ± 13% numa-vmstat.node1.nr_zone_active_anon
1.78 -0.5 1.29 ± 8% perf-stat.i.branch-miss-rate%
70791860 -29.6% 49835379 ± 3% perf-stat.i.branch-misses
0.37 +1.6% 0.38 perf-stat.i.ipc
1.77 -0.5 1.24 ± 4% perf-stat.overall.branch-miss-rate%
0.42 +0.0 0.43 perf-stat.overall.dTLB-store-miss-rate%
0.37 +1.9% 0.38 perf-stat.overall.ipc
70559062 -29.6% 49670019 ± 3% perf-stat.ps.branch-misses
5.707e+12 +1.3% 5.781e+12 perf-stat.total.instructions
5.98 ±173% +418.0% 30.97 ± 24% sched_debug.cfs_rq:/.MIN_vruntime.avg
496.29 ±173% +382.3% 2393 ± 19% sched_debug.cfs_rq:/.MIN_vruntime.max
54.15 ±173% +399.4% 270.39 ± 21% sched_debug.cfs_rq:/.MIN_vruntime.stddev
5.98 ±173% +418.6% 31.01 ± 24% sched_debug.cfs_rq:/.max_vruntime.avg
496.29 ±173% +382.9% 2396 ± 19% sched_debug.cfs_rq:/.max_vruntime.max
54.15 ±173% +400.0% 270.73 ± 21% sched_debug.cfs_rq:/.max_vruntime.stddev
0.97 ± 97% +109.6% 2.03 ± 45% sched_debug.cfs_rq:/.removed.util_avg.avg
47.79 ± 65% +80.6% 86.33 sched_debug.cfs_rq:/.removed.util_avg.max
13822 ± 4% +40.7% 19442 ± 41% softirqs.CPU15.RCU
16567 ± 9% +59.4% 26408 ± 43% softirqs.CPU20.RCU
17248 ± 2% +41.4% 24388 ± 41% softirqs.CPU24.RCU
16620 ± 4% +45.6% 24199 ± 42% softirqs.CPU29.RCU
19099 ± 2% +42.6% 27226 ± 38% softirqs.CPU34.RCU
13569 +37.6% 18671 ± 40% softirqs.CPU44.RCU
29886 ± 20% +53.1% 45752 ± 41% softirqs.CPU52.RCU
32893 +36.9% 45040 ± 40% softirqs.CPU57.RCU
16755 ± 6% +45.4% 24356 ± 44% softirqs.CPU60.RCU
17657 ± 2% +42.7% 25188 ± 42% softirqs.CPU65.RCU
17628 ± 5% +44.9% 25546 ± 40% softirqs.CPU67.RCU
17892 ± 2% +48.2% 26515 ± 42% softirqs.CPU68.RCU
18032 ± 6% +40.4% 25309 ± 41% softirqs.CPU69.RCU
17796 ± 3% +43.6% 25552 ± 43% softirqs.CPU71.RCU
22725 ± 7% +48.6% 33771 ± 38% softirqs.CPU8.RCU
21034 ± 5% +40.6% 29581 ± 37% softirqs.CPU80.RCU
12.48 ± 8% -1.8 10.64 ± 5% perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
9.51 ± 8% -1.7 7.80 ± 5% perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
9.19 ± 8% -1.7 7.49 ± 4% perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.34 ± 8% -1.5 3.86 ± 4% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
12.54 ± 8% -1.8 10.70 ± 5% perf-profile.children.cycles-pp.__x64_sys_sched_yield
9.51 ± 8% -1.7 7.80 ± 5% perf-profile.children.cycles-pp.schedule
9.28 ± 8% -1.7 7.59 ± 4% perf-profile.children.cycles-pp.__schedule
5.44 ± 8% -1.5 3.97 ± 4% perf-profile.children.cycles-pp.pick_next_task_fair
0.66 ± 8% -0.1 0.56 ± 7% perf-profile.children.cycles-pp.__calc_delta
0.06 ± 6% +0.1 0.12 ± 38% perf-profile.children.cycles-pp.update_blocked_averages
0.07 ± 7% +0.1 0.12 ± 37% perf-profile.children.cycles-pp.run_rebalance_domains
0.11 ± 17% +0.1 0.20 ± 39% perf-profile.children.cycles-pp.rebalance_domains
0.30 ± 9% +0.3 0.55 ± 34% perf-profile.children.cycles-pp.__softirqentry_text_start
0.43 ± 8% +0.3 0.72 ± 28% perf-profile.children.cycles-pp.irq_exit
1.67 ± 9% -1.0 0.71 ± 6% perf-profile.self.cycles-pp.pick_next_task_fair
0.63 ± 8% -0.1 0.54 ± 7% perf-profile.self.cycles-pp.__calc_delta
0.06 ± 15% +0.0 0.09 ± 39% perf-profile.self.cycles-pp.run_timer_softirq
185475 ± 59% -68.5% 58457 ±172% interrupts.CPU1.RES:Rescheduling_interrupts
3464 ± 9% -9.0% 3151 interrupts.CPU35.CAL:Function_call_interrupts
618.50 ± 21% +22.3% 756.25 ± 5% interrupts.CPU37.NMI:Non-maskable_interrupts
618.50 ± 21% +22.3% 756.25 ± 5% interrupts.CPU37.PMI:Performance_monitoring_interrupts
2.25 ±110% +47533.3% 1071 ±164% interrupts.CPU42.RES:Rescheduling_interrupts
7906 -31.0% 5456 ± 30% interrupts.CPU50.NMI:Non-maskable_interrupts
7906 -31.0% 5456 ± 30% interrupts.CPU50.PMI:Performance_monitoring_interrupts
488.50 ± 36% +49.8% 731.75 ± 5% interrupts.CPU67.NMI:Non-maskable_interrupts
488.50 ± 36% +49.8% 731.75 ± 5% interrupts.CPU67.PMI:Performance_monitoring_interrupts
585.75 ± 19% +27.4% 746.50 ± 6% interrupts.CPU70.NMI:Non-maskable_interrupts
585.75 ± 19% +27.4% 746.50 ± 6% interrupts.CPU70.PMI:Performance_monitoring_interrupts
577.50 ± 23% +24.3% 717.75 ± 6% interrupts.CPU72.NMI:Non-maskable_interrupts
577.50 ± 23% +24.3% 717.75 ± 6% interrupts.CPU72.PMI:Performance_monitoring_interrupts
529.25 ± 22% +24.2% 657.50 ± 5% interrupts.CPU74.NMI:Non-maskable_interrupts
529.25 ± 22% +24.2% 657.50 ± 5% interrupts.CPU74.PMI:Performance_monitoring_interrupts
516.50 ± 21% +29.9% 671.00 ± 8% interrupts.CPU75.NMI:Non-maskable_interrupts
516.50 ± 21% +29.9% 671.00 ± 8% interrupts.CPU75.PMI:Performance_monitoring_interrupts
3.25 ±125% +51438.5% 1675 ±170% interrupts.CPU78.RES:Rescheduling_interrupts
613.00 ± 22% +25.4% 768.75 ± 7% interrupts.CPU81.NMI:Non-maskable_interrupts
613.00 ± 22% +25.4% 768.75 ± 7% interrupts.CPU81.PMI:Performance_monitoring_interrupts
605.25 ± 22% +25.2% 757.75 ± 7% interrupts.CPU82.NMI:Non-maskable_interrupts
605.25 ± 22% +25.2% 757.75 ± 7% interrupts.CPU82.PMI:Performance_monitoring_interrupts
585.25 ± 19% +20.7% 706.50 ± 4% interrupts.CPU83.NMI:Non-maskable_interrupts
585.25 ± 19% +20.7% 706.50 ± 4% interrupts.CPU83.PMI:Performance_monitoring_interrupts
will-it-scale.per_thread_ops
1.26e+06 +-+-------------------------------------------------------------+
| O |
1.255e+06 O-+O O O O O O O O O O O O |
1.25e+06 +-+ O O O O O O |
| O O O O |
1.245e+06 +-+ O
1.24e+06 +-+ |
| |
1.235e+06 +-+ |
1.23e+06 +-+ |
| |
1.225e+06 +-++ .+.+.. +..+.. |
1.22e+06 +-+ + .+..+. .+. +..+.+..+..+.+.. + +.+..+..+ |
| +. +. +..+ |
1.215e+06 +-+-------------------------------------------------------------+
will-it-scale.workload
2.02e+07 +-+--------------------------------------------------------------+
| |
2.01e+07 +-+ O O O |
O O O O O O O O O O O O O |
2e+07 +-+ O O O O O O |
1.99e+07 +-+ O O O
| |
1.98e+07 +-+ |
| |
1.97e+07 +-+ |
1.96e+07 +-+ |
| .+ +..+.. .+..+. .+.. .+. .+.. .+..+.+..+.. |
1.95e+07 +-+ + .. +.+. +. +. +. +.+. +..+ |
| + |
1.94e+07 +-+--------------------------------------------------------------+
will-it-scale.time.user_time
3260 +-+-----------O--------------------O-O-------------------------------+
O O O O O O O O O O O O O O O |
3240 +-+ O O O O O O |
| O
3220 +-+ |
| |
3200 +-+ |
| |
3180 +-+ +.. +..+.. +..+.. |
| + +.+.. .. +..+.. + +.. + +.. .+..+ |
3160 +-+ + + .. + .. +.. + + |
|: + + + + |
3140 +-+ + |
| : + |
3120 +-+------------------------------------------------------------------+
will-it-scale.time.system_time
1700 +-+------------------------------------------------------------------+
| +.. |
1680 +-+ |
1660 +-+ +.. .+.. +.. .+ |
| .+..+.. .+. + .+. + .+..+.+.. |
1640 +-+ +..+ .+..+. + +. + .+. + |
| +. +. |
1620 +-+ |
| |
1600 +-+ |
1580 +-+ O O
| O O O O O O O |
1560 O-+O O O O O O O O O O O O O O O |
| |
1540 +-+------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-rc7-00033-g5d7d605642b28" of type "text/plain" (200562 bytes)
View attachment "job-script" of type "text/plain" (7853 bytes)
View attachment "job.yaml" of type "text/plain" (5498 bytes)
View attachment "reproduce" of type "text/plain" (314 bytes)