Message-ID: <20200827012727.GN4299@shao2-debian>
Date: Thu, 27 Aug 2020 09:27:28 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ingo Molnar <mingo@...nel.org>, Rik van Riel <riel@...riel.com>,
Ben Segall <bsegall@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>, Mike Galbraith <efault@....de>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com, aubrey.li@...ux.intel.com,
yu.c.chen@...el.com
Subject: [sched/fair] fcf0553db6: vm-scalability.median 4.6% improvement
Greetings,
FYI, we noticed a 4.6% improvement of vm-scalability.median due to commit:
commit: fcf0553db6f4c79387864f6e4ab4a891601f395e ("sched/fair: Remove meaningless imbalance calculation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:
runtime: 300s
size: 8T
test: anon-w-seq
cpufreq_governor: performance
ucode: 0x16
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
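For context, the patch under test removes the legacy average-load imbalance
computation from the CFS load balancer as one step of the load-balance rework.
Purely as an illustrative toy model (not the kernel's actual code; the
function name, arguments, and numbers below are all assumptions), the general
shape of the kind of heuristic being removed is:

    # Toy sketch of an average-load imbalance heuristic (illustrative only,
    # not kernel code; names and numbers are assumptions).
    def imbalance(busiest_load, local_load, avg_load):
        # Load to pull from the busiest group toward the local group, without
        # pushing the busiest group below the domain average or lifting the
        # local group above it.
        if busiest_load <= avg_load or local_load >= avg_load:
            return 0
        max_pull = busiest_load - avg_load
        room = avg_load - local_load
        return min(max_pull, room)

    print(imbalance(busiest_load=1500, local_load=700, avg_load=1000))  # -> 300

Fewer pull decisions of this style would be at least consistent with the drops
in involuntary context switches and cpu-migrations reported below.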
In addition, the commit also has a significant impact on the following tests:
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run job.yaml
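Roughly speaking, "bin/lkp install job.yaml" installs the benchmark and the
dependencies declared by the attached job file, and "bin/lkp run job.yaml"
executes the job and collects the monitor data tabulated below.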
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-hsw-4ex1/anon-w-seq/vm-scalability/0x16
commit:
a349834703 ("sched/fair: Rename sg_lb_stats::sum_nr_running to sum_h_nr_running")
fcf0553db6 ("sched/fair: Remove meaningless imbalance calculation")
a349834703010183 fcf0553db6f4c79387864f6e4ab
---------------- ---------------------------
%stddev %change %stddev
\ | \
595857 +4.6% 623549 vm-scalability.median
8.22 ± 7% -3.2 5.01 ± 5% vm-scalability.median_stddev%
88608356 +1.5% 89976003 vm-scalability.throughput
498624 ± 2% -9.5% 451343 vm-scalability.time.involuntary_context_switches
9335 -1.3% 9212 vm-scalability.time.percent_of_cpu_this_job_got
12408855 ± 9% -17.1% 10291775 ± 8% meminfo.DirectMap2M
20311 ± 40% +60.3% 32564 ± 16% numa-numastat.node0.other_node
1759 ± 8% +62.0% 2850 ± 6% syscalls.sys_read.med
6827 ± 2% -2.8% 6636 vmstat.system.cs
12092 ± 18% -43.8% 6792 ± 30% numa-vmstat.node0.nr_slab_reclaimable
6233 ± 26% +29.4% 8069 ± 23% numa-vmstat.node3.nr_slab_reclaimable
14990 ± 12% +45.0% 21732 ± 18% numa-vmstat.node3.nr_slab_unreclaimable
48371 ± 18% -43.8% 27169 ± 30% numa-meminfo.node0.KReclaimable
48371 ± 18% -43.8% 27169 ± 30% numa-meminfo.node0.SReclaimable
167341 ± 16% -32.6% 112749 ± 31% numa-meminfo.node0.Slab
24931 ± 26% +29.5% 32277 ± 23% numa-meminfo.node3.KReclaimable
24931 ± 26% +29.5% 32277 ± 23% numa-meminfo.node3.SReclaimable
59961 ± 12% +45.0% 86933 ± 18% numa-meminfo.node3.SUnreclaim
84893 ± 15% +40.4% 119211 ± 13% numa-meminfo.node3.Slab
4627 ± 9% +25.4% 5802 ± 4% slabinfo.eventpoll_pwq.active_objs
4627 ± 9% +25.4% 5802 ± 4% slabinfo.eventpoll_pwq.num_objs
2190 ± 14% +26.1% 2762 ± 4% slabinfo.kmem_cache.active_objs
2190 ± 14% +26.1% 2762 ± 4% slabinfo.kmem_cache.num_objs
5823 ± 12% +22.2% 7118 ± 4% slabinfo.kmem_cache_node.active_objs
9665 ± 5% +10.3% 10663 ± 3% slabinfo.shmem_inode_cache.active_objs
9775 ± 5% +10.1% 10762 ± 3% slabinfo.shmem_inode_cache.num_objs
322.79 -7.9% 297.22 perf-stat.i.cpu-migrations
21.89 ± 3% -1.2 20.69 perf-stat.i.node-load-miss-rate%
13112232 ± 3% +3.4% 13562841 perf-stat.i.node-loads
21.44 ± 2% -1.1 20.29 perf-stat.overall.node-load-miss-rate%
3160 +1.9% 3219 perf-stat.overall.path-length
6748 ± 2% -2.6% 6570 perf-stat.ps.context-switches
319.09 -8.2% 292.97 perf-stat.ps.cpu-migrations
12905267 ± 3% +3.4% 13345546 perf-stat.ps.node-loads
12193 ± 5% +8.6% 13239 ± 4% softirqs.CPU105.RCU
12791 ± 6% +12.4% 14382 ± 7% softirqs.CPU22.RCU
12781 ± 5% +12.3% 14351 ± 4% softirqs.CPU28.RCU
12870 ± 4% +9.0% 14033 ± 7% softirqs.CPU31.RCU
11451 ± 5% +11.7% 12785 ± 4% softirqs.CPU90.RCU
11449 ± 6% +9.1% 12497 ± 6% softirqs.CPU91.RCU
11486 ± 6% +8.9% 12510 softirqs.CPU93.RCU
12197 ± 5% +10.4% 13462 ± 5% softirqs.CPU97.RCU
3899 ±100% -78.7% 830.89 ± 67% sched_debug.cfs_rq:/.load_avg.max
386.54 ± 91% -76.1% 92.43 ± 61% sched_debug.cfs_rq:/.load_avg.stddev
747.05 ± 7% -26.9% 546.12 ± 31% sched_debug.cfs_rq:/.runnable_load_avg.max
-203275 +540.7% -1302453 sched_debug.cfs_rq:/.spread0.avg
1142 ± 5% +9.6% 1251 ± 6% sched_debug.cfs_rq:/.util_avg.max
1.00 ±163% +2117.5% 22.17 ±100% sched_debug.cfs_rq:/.util_avg.min
212.47 ± 14% -16.9% 176.57 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.stddev
0.00 ± 7% +226.7% 0.00 ± 18% sched_debug.cpu.next_balance.stddev
2.48 ± 20% -24.6% 1.87 ± 11% sched_debug.cpu.nr_running.max
0.64 ± 29% -49.1% 0.33 ± 11% sched_debug.cpu.nr_running.stddev
137.50 ± 54% +194.2% 404.50 ± 78% interrupts.CPU109.RES:Rescheduling_interrupts
822.50 ± 22% -39.4% 498.75 ± 49% interrupts.CPU114.CAL:Function_call_interrupts
126.00 ± 72% +161.1% 329.00 ± 48% interrupts.CPU117.RES:Rescheduling_interrupts
153.00 ± 69% +121.6% 339.00 ± 51% interrupts.CPU119.RES:Rescheduling_interrupts
162.75 ± 83% +170.8% 440.75 ± 36% interrupts.CPU121.RES:Rescheduling_interrupts
798.00 ± 30% -45.0% 438.75 ± 28% interrupts.CPU126.CAL:Function_call_interrupts
3831 ± 37% -96.4% 138.50 ±106% interrupts.CPU128.NMI:Non-maskable_interrupts
3831 ± 37% -96.4% 138.50 ±106% interrupts.CPU128.PMI:Performance_monitoring_interrupts
218.25 ± 44% -51.7% 105.50 ± 4% interrupts.CPU128.RES:Rescheduling_interrupts
453.00 ± 74% -74.3% 116.50 ± 26% interrupts.CPU130.RES:Rescheduling_interrupts
2382 ± 15% -96.6% 80.75 ±133% interrupts.CPU138.NMI:Non-maskable_interrupts
2382 ± 15% -96.6% 80.75 ±133% interrupts.CPU138.PMI:Performance_monitoring_interrupts
557.25 ±155% +341.2% 2458 ± 79% interrupts.CPU14.NMI:Non-maskable_interrupts
557.25 ±155% +341.2% 2458 ± 79% interrupts.CPU14.PMI:Performance_monitoring_interrupts
782.00 ± 31% -37.0% 492.50 ± 50% interrupts.CPU17.CAL:Function_call_interrupts
779.75 ± 31% -36.8% 492.50 ± 50% interrupts.CPU18.CAL:Function_call_interrupts
782.25 ± 31% -37.2% 491.00 ± 49% interrupts.CPU20.CAL:Function_call_interrupts
5855 ± 20% -75.7% 1420 ±106% interrupts.CPU23.NMI:Non-maskable_interrupts
5855 ± 20% -75.7% 1420 ±106% interrupts.CPU23.PMI:Performance_monitoring_interrupts
785.00 ± 30% -38.0% 487.00 ± 51% interrupts.CPU3.CAL:Function_call_interrupts
1266 ±166% +241.5% 4325 ± 38% interrupts.CPU34.NMI:Non-maskable_interrupts
1266 ±166% +241.5% 4325 ± 38% interrupts.CPU34.PMI:Performance_monitoring_interrupts
784.00 ± 30% -37.0% 493.75 ± 49% interrupts.CPU36.CAL:Function_call_interrupts
783.25 ± 31% -37.6% 489.00 ± 51% interrupts.CPU37.CAL:Function_call_interrupts
325.75 ± 11% +80.6% 588.25 ± 45% interrupts.CPU37.RES:Rescheduling_interrupts
782.00 ± 31% -37.0% 492.50 ± 50% interrupts.CPU39.CAL:Function_call_interrupts
781.50 ± 31% -38.0% 484.50 ± 52% interrupts.CPU41.CAL:Function_call_interrupts
292.50 ± 21% +59.2% 465.75 ± 26% interrupts.CPU43.RES:Rescheduling_interrupts
233.00 ± 12% +59.4% 371.50 ± 20% interrupts.CPU48.RES:Rescheduling_interrupts
883.00 ±160% -97.1% 25.25 ±150% interrupts.CPU50.NMI:Non-maskable_interrupts
883.00 ±160% -97.1% 25.25 ±150% interrupts.CPU50.PMI:Performance_monitoring_interrupts
265.25 ± 28% +69.8% 450.50 ± 27% interrupts.CPU50.RES:Rescheduling_interrupts
249.75 ± 19% +40.8% 351.75 ± 11% interrupts.CPU51.RES:Rescheduling_interrupts
785.50 ± 30% -37.2% 493.00 ± 49% interrupts.CPU53.CAL:Function_call_interrupts
1.75 ±116% +1.2e+05% 2020 ±108% interrupts.CPU53.NMI:Non-maskable_interrupts
1.75 ±116% +1.2e+05% 2020 ±108% interrupts.CPU53.PMI:Performance_monitoring_interrupts
826.75 ± 79% +429.5% 4377 ± 27% interrupts.CPU56.NMI:Non-maskable_interrupts
826.75 ± 79% +429.5% 4377 ± 27% interrupts.CPU56.PMI:Performance_monitoring_interrupts
782.50 ± 31% -37.6% 488.00 ± 51% interrupts.CPU57.CAL:Function_call_interrupts
32.25 ±164% -100.0% 0.00 interrupts.CPU57.TLB:TLB_shootdowns
782.50 ± 31% -36.9% 494.00 ± 50% interrupts.CPU6.CAL:Function_call_interrupts
781.00 ± 31% -36.8% 493.50 ± 50% interrupts.CPU61.CAL:Function_call_interrupts
658.75 ±169% +429.9% 3490 ± 43% interrupts.CPU61.NMI:Non-maskable_interrupts
658.75 ±169% +429.9% 3490 ± 43% interrupts.CPU61.PMI:Performance_monitoring_interrupts
405.75 ± 20% -35.6% 261.50 ± 17% interrupts.CPU62.RES:Rescheduling_interrupts
781.75 ± 31% -36.8% 494.00 ± 50% interrupts.CPU65.CAL:Function_call_interrupts
782.50 ± 31% -37.0% 492.75 ± 50% interrupts.CPU66.CAL:Function_call_interrupts
838.25 ±121% +513.8% 5145 ± 30% interrupts.CPU66.NMI:Non-maskable_interrupts
838.25 ±121% +513.8% 5145 ± 30% interrupts.CPU66.PMI:Performance_monitoring_interrupts
782.00 ± 31% -37.1% 492.25 ± 50% interrupts.CPU67.CAL:Function_call_interrupts
782.25 ± 31% -37.0% 492.75 ± 49% interrupts.CPU69.CAL:Function_call_interrupts
780.00 ± 31% -37.3% 489.25 ± 49% interrupts.CPU71.CAL:Function_call_interrupts
784.25 ± 31% -38.0% 486.50 ± 51% interrupts.CPU75.CAL:Function_call_interrupts
783.25 ± 31% -36.9% 494.25 ± 50% interrupts.CPU77.CAL:Function_call_interrupts
822.00 ± 33% -39.9% 494.25 ± 50% interrupts.CPU83.CAL:Function_call_interrupts
789.75 ± 29% -37.1% 496.50 ± 49% interrupts.CPU90.CAL:Function_call_interrupts
5966 ± 15% -53.5% 2777 ± 62% interrupts.CPU94.NMI:Non-maskable_interrupts
5966 ± 15% -53.5% 2777 ± 62% interrupts.CPU94.PMI:Performance_monitoring_interrupts
46379 ± 9% -10.1% 41674 ± 4% interrupts.RES:Rescheduling_interrupts
68.78 -10.0 58.75 ± 8% perf-profile.calltrace.cycles-pp.do_access
30.04 ± 4% -4.3 25.69 ± 9% perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.do_access
30.06 ± 4% -4.3 25.71 ± 9% perf-profile.calltrace.cycles-pp.page_fault.do_access
30.00 ± 4% -4.3 25.66 ± 9% perf-profile.calltrace.cycles-pp.do_user_addr_fault.do_page_fault.page_fault.do_access
29.93 ± 4% -4.3 25.59 ± 9% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.do_page_fault.page_fault.do_access
26.63 ± 2% -2.7 23.88 ± 6% perf-profile.calltrace.cycles-pp.do_rw_once
0.57 ± 5% -0.3 0.27 ±100% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_huge_pmd_anonymous_page
0.63 ± 6% -0.2 0.43 ± 58% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
0.63 ± 6% -0.2 0.43 ± 58% perf-profile.calltrace.cycles-pp.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.62 ± 5% -0.2 0.42 ± 58% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault
28.25 ± 5% +6.6 34.88 ± 6% perf-profile.calltrace.cycles-pp.clear_page_erms.clear_subpage.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault
29.30 ± 5% +6.7 36.03 ± 6% perf-profile.calltrace.cycles-pp.clear_subpage.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
30.21 ± 5% +7.0 37.23 ± 6% perf-profile.calltrace.cycles-pp.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
31.31 ± 5% +7.3 38.61 ± 6% perf-profile.calltrace.cycles-pp.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.do_page_fault
31.39 ± 5% +7.3 38.70 ± 6% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.do_page_fault.page_fault
1.50 ±106% +11.7 13.17 ± 35% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.do_page_fault.page_fault
1.50 ±106% +11.7 13.20 ± 35% perf-profile.calltrace.cycles-pp.do_user_addr_fault.do_page_fault.page_fault
1.51 ±106% +11.7 13.21 ± 35% perf-profile.calltrace.cycles-pp.page_fault
1.50 ±106% +11.7 13.21 ± 35% perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
65.44 -9.4 56.04 ± 8% perf-profile.children.cycles-pp.do_access
30.08 -3.4 26.70 ± 7% perf-profile.children.cycles-pp.do_rw_once
0.06 ± 11% +0.0 0.08 ± 10% perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.14 ± 6% +0.0 0.16 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.13 ± 10% +0.0 0.16 ± 9% perf-profile.children.cycles-pp.pte_alloc_one
0.12 ± 12% +0.0 0.15 ± 8% perf-profile.children.cycles-pp.prep_new_page
0.12 ± 10% +0.0 0.16 ± 20% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.12 ± 9% +0.0 0.15 ± 10% perf-profile.children.cycles-pp.prepare_exit_to_usermode
0.14 ± 8% +0.0 0.19 ± 9% perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.irq_work_interrupt
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.smp_irq_work_interrupt
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.irq_work_run
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.printk
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.irq_work_run_list
0.00 +0.1 0.07 ± 31% perf-profile.children.cycles-pp.__intel_pmu_enable_all
0.35 ± 3% +0.1 0.43 ± 7% perf-profile.children.cycles-pp._cond_resched
0.38 ± 6% +0.1 0.48 ± 6% perf-profile.children.cycles-pp.___might_sleep
0.46 ± 8% +0.1 0.60 ± 13% perf-profile.children.cycles-pp.scheduler_tick
0.66 ± 8% +0.2 0.81 ± 8% perf-profile.children.cycles-pp.rmqueue
0.70 ± 8% +0.2 0.87 ± 8% perf-profile.children.cycles-pp.alloc_pages_vma
0.79 ± 9% +0.2 0.97 ± 8% perf-profile.children.cycles-pp.get_page_from_freelist
0.83 ± 8% +0.2 1.03 ± 7% perf-profile.children.cycles-pp.__alloc_pages_nodemask
1.09 ± 6% +0.3 1.38 ± 11% perf-profile.children.cycles-pp.__hrtimer_run_queues
2.11 ± 3% +0.5 2.63 ± 12% perf-profile.children.cycles-pp.apic_timer_interrupt
28.56 ± 5% +6.6 35.18 ± 6% perf-profile.children.cycles-pp.clear_page_erms
29.39 ± 5% +6.8 36.14 ± 6% perf-profile.children.cycles-pp.clear_subpage
30.34 ± 5% +7.0 37.30 ± 6% perf-profile.children.cycles-pp.clear_huge_page
31.39 ± 5% +7.2 38.61 ± 6% perf-profile.children.cycles-pp.do_huge_pmd_anonymous_page
31.47 ± 5% +7.2 38.72 ± 6% perf-profile.children.cycles-pp.__handle_mm_fault
31.52 ± 5% +7.3 38.77 ± 6% perf-profile.children.cycles-pp.handle_mm_fault
31.60 ± 5% +7.3 38.87 ± 6% perf-profile.children.cycles-pp.do_user_addr_fault
31.67 ± 5% +7.3 38.95 ± 6% perf-profile.children.cycles-pp.page_fault
31.63 ± 5% +7.3 38.91 ± 6% perf-profile.children.cycles-pp.do_page_fault
29.16 ± 4% -4.1 25.05 ± 7% perf-profile.self.cycles-pp.do_access
27.38 -3.1 24.28 ± 7% perf-profile.self.cycles-pp.do_rw_once
0.05 ± 8% +0.0 0.07 ± 12% perf-profile.self.cycles-pp.prep_new_page
0.09 ± 9% +0.0 0.12 ± 11% perf-profile.self.cycles-pp.prepare_exit_to_usermode
0.00 +0.1 0.05 ± 8% perf-profile.self.cycles-pp.do_huge_pmd_anonymous_page
0.00 +0.1 0.07 ± 31% perf-profile.self.cycles-pp.__intel_pmu_enable_all
0.28 ± 3% +0.1 0.35 ± 7% perf-profile.self.cycles-pp._cond_resched
0.35 ± 4% +0.1 0.43 ± 6% perf-profile.self.cycles-pp.___might_sleep
0.48 ± 8% +0.1 0.60 ± 9% perf-profile.self.cycles-pp.rmqueue
0.88 ± 4% +0.2 1.11 ± 3% perf-profile.self.cycles-pp.clear_subpage
28.18 ± 5% +6.4 34.57 ± 6% perf-profile.self.cycles-pp.clear_page_erms
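To read the comparison table: the left column is the parent commit's mean over
repeated runs, the right column is the tested commit's mean, "±" gives the
run-to-run %stddev where it is non-negligible, and %change is the relative
delta between the two means. A quick sanity check of the headline numbers
(pct_change is a hypothetical helper, computed from the means above):

    # Recompute the headline deltas from the table's means.
    def pct_change(parent, patched):
        return (patched - parent) / parent * 100

    print(round(pct_change(595857, 623549), 1))      # 4.6 -> vm-scalability.median
    print(round(pct_change(88608356, 89976003), 1))  # 1.5 -> vm-scalability.throughput

For metrics already expressed in percent (median_stddev%, the node-load-miss
rates, and the perf-profile cycles), the middle column is instead the absolute
difference in percentage points (e.g. 8.22 - 5.01 = 3.2).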
vm-scalability.median
660000 +------------------------------------------------------------------+
650000 |-+O |
| O O O O O O O |
640000 |-+ O O O O O O O O O |
630000 |-+ O O O O |
| O O O |
620000 |-+ |
610000 |-+ |
600000 |-+ +..+. .+ |
| +.. .+.. .. +. + .+.+ |
590000 |-.+. + + + +. |
580000 |.+ +..+ +.. +.. : |
| + : |
570000 |-+ + + |
560000 +------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-rc1-00009-gfcf0553db6f4c" of type "text/plain" (152748 bytes)
View attachment "job-script" of type "text/plain" (7863 bytes)
View attachment "job.yaml" of type "text/plain" (5385 bytes)
View attachment "reproduce" of type "text/plain" (5039 bytes)