Message-ID: <20191006121901.GK17687@shao2-debian>
Date: Sun, 6 Oct 2019 20:19:01 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Waiman Long <longman@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrew Morton <akpm@...ux-foundation.org>,
Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
Davidlohr Bueso <dave@...olabs.net>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Will Deacon <will.deacon@....com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [locking/rwsem] 364f784f04: will-it-scale.per_thread_ops -8.6% regression
Greetings,
FYI, we noticed a -8.6% regression of will-it-scale.per_thread_ops due to commit:
commit: 364f784f048c984721986db90c95ca8350213c91 ("locking/rwsem: Optimize rwsem structure for uncontended lock acquisition")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with following parameters:
nr_task: 100%
mode: thread
test: page_fault2
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based variant of each test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
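
For context on the workload: page_fault2 has each worker repeatedly map a shared temporary file, write one byte to every page (taking one minor fault, and with it the mmap_sem read lock, per page), then unmap. Below is a minimal single-worker sketch of that loop, simplified from the testcase in the will-it-scale repository above; the 128 MB mapping size, the temp-file path, and the absence of the harness's fault counter are assumptions of this sketch, and the harness runs 1 through n parallel copies of it:

        #include <stdlib.h>
        #include <unistd.h>
        #include <sys/mman.h>

        #define MEMSIZE (128UL * 1024 * 1024)   /* size is an assumption */

        int main(void)
        {
                char path[] = "/tmp/willitscale.XXXXXX";  /* hypothetical path */
                long page_size = sysconf(_SC_PAGESIZE);
                int fd = mkstemp(path);

                if (fd < 0)
                        return 1;
                unlink(path);                   /* file lives until close */
                if (ftruncate(fd, MEMSIZE) < 0)
                        return 1;

                for (;;) {                      /* the harness counts iterations */
                        char *p = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
                        if (p == MAP_FAILED)
                                return 1;
                        /* one write per page => one minor fault per page,
                         * each taking mmap_sem for read in the kernel */
                        for (unsigned long off = 0; off < MEMSIZE; off += page_size)
                                p[off] = 1;
                        munmap(p, MEMSIZE);
                }
        }

At nr_task=100% on this 192-thread box, per_thread_ops is then largely a measure of how cheaply the fault path can take and drop mmap_sem for read.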
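
As for the bisected commit: its subject says it optimizes the rwsem structure for uncontended lock acquisition, i.e. a layout change rather than an algorithmic one. A rough illustration of that idea in plain C (simplified userspace types; this is NOT the kernel's actual struct rw_semaphore definition):

        #include <stdatomic.h>
        #include <pthread.h>

        struct rwsem_layout_sketch {
                /* uncontended fast path: ideally together on one cache line */
                atomic_long count;              /* reader count / writer bit  */
                atomic_uintptr_t owner;         /* owning writer, when held   */
                /* fields below are only touched under contention */
                pthread_spinlock_t wait_lock;   /* serializes the waiter list */
                struct waiter *wait_list;       /* queue of sleepers (opaque) */
        };

Since every fault in this workload does a down_read()/up_read() on mmap_sem, even cache-line-level layout shifts in the semaphore can move the numbers here, which fits up_read() being prominent in the profile below.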
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml   # job file is attached in this email
        bin/lkp run     job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.6/thread/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/page_fault2/will-it-scale
commit:
a8654596f0 ("locking/rwsem: Enable lock event counting")
364f784f04 ("locking/rwsem: Optimize rwsem structure for uncontended lock acquisition")
a8654596f0371c26             364f784f048c984721986db90c9
----------------             ---------------------------
       fail:runs  %reproduction      fail:runs
             1:4            -2%            1:4  perf-profile.children.cycles-pp.error_entry
         %stddev        %change        %stddev
11003 ± 2% -8.6% 10053 will-it-scale.per_thread_ops
70653 ± 2% +11.0% 78446 ± 4% will-it-scale.time.involuntary_context_switches
6.818e+08 ± 4% -10.2% 6.124e+08 ± 3% will-it-scale.time.minor_page_faults
10405 +7.5% 11184 will-it-scale.time.percent_of_cpu_this_job_got
241.21 ± 6% -19.2% 194.93 ± 3% will-it-scale.time.user_time
5070481 ± 3% -5.2% 4807787 ± 3% will-it-scale.time.voluntary_context_switches
2112773 ± 2% -8.6% 1930300 will-it-scale.workload
6197 +0.7% 6239 boot-time.idle
1.922e+08 ± 5% -10.7% 1.716e+08 ± 2% numa-numastat.node1.local_node
1.922e+08 ± 5% -10.7% 1.716e+08 ± 2% numa-numastat.node1.numa_hit
98587680 ± 5% -10.8% 87968117 ± 2% numa-vmstat.node1.numa_hit
98470865 ± 5% -10.8% 87856550 ± 2% numa-vmstat.node1.numa_local
45.50 -4.0 41.51 mpstat.cpu.all.idle%
0.00 ± 15% +0.0 0.00 ± 7% mpstat.cpu.all.soft%
0.41 ± 5% -0.1 0.34 mpstat.cpu.all.usr%
45.25 -8.8% 41.25 vmstat.cpu.id
53.25 +8.0% 57.50 vmstat.cpu.sy
32467 -3.3% 31402 vmstat.system.cs
2321 +34.0% 3112 slabinfo.task_struct.active_objs
2335 -66.7% 778.00 slabinfo.task_struct.active_slabs
2335 +33.4% 3114 slabinfo.task_struct.num_objs
2335 -66.7% 778.00 slabinfo.task_struct.num_slabs
1611 +7.3% 1729 turbostat.Avg_MHz
43.82 ± 2% -8.2% 40.23 turbostat.CPU%c1
0.45 ± 49% -59.8% 0.18 ± 6% turbostat.Pkg%pc2
358.05 +1.8% 364.46 turbostat.PkgWatt
67050 -5.3% 63472 proc-vmstat.nr_slab_unreclaimable
6.908e+08 ± 4% -10.1% 6.212e+08 ± 2% proc-vmstat.numa_hit
6.907e+08 ± 4% -10.1% 6.211e+08 ± 2% proc-vmstat.numa_local
6.909e+08 ± 4% -10.1% 6.214e+08 ± 2% proc-vmstat.pgalloc_normal
6.829e+08 ± 4% -10.2% 6.135e+08 ± 3% proc-vmstat.pgfault
6.907e+08 ± 4% -10.1% 6.212e+08 ± 3% proc-vmstat.pgfree
36167 ± 4% -7.9% 33325 softirqs.CPU0.SCHED
21808 ± 5% -8.2% 20028 softirqs.CPU101.RCU
22463 ± 3% -13.7% 19393 ± 3% softirqs.CPU105.RCU
21973 ± 4% -8.6% 20082 ± 4% softirqs.CPU106.RCU
22115 ± 3% -10.4% 19808 ± 2% softirqs.CPU107.RCU
32888 ± 3% -8.8% 29994 ± 2% softirqs.CPU11.SCHED
32399 ± 3% -9.9% 29180 ± 2% softirqs.CPU110.SCHED
23350 ± 8% -10.8% 20817 ± 2% softirqs.CPU2.RCU
20532 ± 6% +12.3% 23062 ± 4% softirqs.CPU51.RCU
32903 ± 3% -8.8% 30006 ± 2% softirqs.CPU89.SCHED
22179 ± 5% -8.5% 20296 ± 4% softirqs.CPU97.RCU
2.357e+09 ± 2% -8.5% 2.157e+09 perf-stat.i.branch-instructions
68.70 +2.7 71.37 perf-stat.i.cache-miss-rate%
2.068e+08 -11.2% 1.837e+08 ± 4% perf-stat.i.cache-misses
2.999e+08 -14.5% 2.564e+08 ± 5% perf-stat.i.cache-references
32849 -3.2% 31781 perf-stat.i.context-switches
25.79 ± 3% +16.6% 30.09 ± 2% perf-stat.i.cpi
3.061e+11 +7.2% 3.281e+11 perf-stat.i.cpu-cycles
1472 ± 2% +21.5% 1789 ± 5% perf-stat.i.cycles-between-cache-misses
3.164e+09 ± 2% -8.0% 2.912e+09 perf-stat.i.dTLB-loads
1.14 -0.0 1.12 perf-stat.i.dTLB-store-miss-rate%
18451516 ± 2% -10.2% 16572608 perf-stat.i.dTLB-store-misses
1.582e+09 -8.3% 1.45e+09 perf-stat.i.dTLB-stores
1.18e+10 ± 2% -8.3% 1.082e+10 perf-stat.i.instructions
2303 ± 3% -7.5% 2131 perf-stat.i.instructions-per-iTLB-miss
0.04 ± 3% -9.0% 0.04 ± 4% perf-stat.i.ipc
2090997 ± 2% -8.8% 1906802 perf-stat.i.minor-faults
62234770 ± 2% -16.8% 51765588 ± 2% perf-stat.i.node-loads
25.30 ± 4% -6.6 18.65 ± 2% perf-stat.i.node-store-miss-rate%
3416019 ± 2% -28.7% 2434638 ± 2% perf-stat.i.node-store-misses
10406782 ± 2% +7.8% 11220238 perf-stat.i.node-stores
2090978 ± 2% -8.8% 1906807 perf-stat.i.page-faults
68.96 +2.8 71.71 perf-stat.overall.cache-miss-rate%
25.96 ± 3% +16.8% 30.32 ± 2% perf-stat.overall.cpi
1480 ± 2% +20.8% 1789 ± 3% perf-stat.overall.cycles-between-cache-misses
1.15 -0.0 1.13 perf-stat.overall.dTLB-store-miss-rate%
2300 ± 3% -7.3% 2131 perf-stat.overall.instructions-per-iTLB-miss
0.04 ± 3% -14.5% 0.03 ± 2% perf-stat.overall.ipc
24.72 ± 3% -6.9 17.83 ± 2% perf-stat.overall.node-store-miss-rate%
2.35e+09 ± 2% -8.5% 2.151e+09 perf-stat.ps.branch-instructions
2.062e+08 -11.2% 1.831e+08 ± 4% perf-stat.ps.cache-misses
2.99e+08 -14.5% 2.556e+08 ± 5% perf-stat.ps.cache-references
32745 -3.2% 31682 perf-stat.ps.context-switches
3.052e+11 +7.2% 3.271e+11 perf-stat.ps.cpu-cycles
3.155e+09 ± 2% -8.0% 2.903e+09 perf-stat.ps.dTLB-loads
18398373 ± 2% -10.2% 16523779 perf-stat.ps.dTLB-store-misses
1.577e+09 -8.3% 1.446e+09 perf-stat.ps.dTLB-stores
1.177e+10 ± 2% -8.3% 1.079e+10 perf-stat.ps.instructions
2084539 ± 2% -8.8% 1900793 perf-stat.ps.minor-faults
62045336 ± 2% -16.8% 51607027 ± 2% perf-stat.ps.node-loads
3405719 ± 2% -28.7% 2427181 ± 2% perf-stat.ps.node-store-misses
10375665 ± 2% +7.8% 11186150 perf-stat.ps.node-stores
2084578 ± 2% -8.8% 1900864 perf-stat.ps.page-faults
3.811e+12 ± 4% -10.1% 3.427e+12 ± 3% perf-stat.total.instructions
470.25 ± 3% -14.7% 401.00 ± 8% interrupts.CPU126.RES:Rescheduling_interrupts
454.50 ± 4% -15.3% 384.75 ± 10% interrupts.CPU127.RES:Rescheduling_interrupts
480.00 ± 6% -17.1% 398.00 ± 10% interrupts.CPU136.RES:Rescheduling_interrupts
474.50 ± 8% -17.4% 391.75 ± 5% interrupts.CPU138.RES:Rescheduling_interrupts
493.50 ± 6% -14.8% 420.25 ± 6% interrupts.CPU143.RES:Rescheduling_interrupts
554.00 -16.1% 464.75 ± 3% interrupts.CPU144.RES:Rescheduling_interrupts
476.25 ± 8% -16.0% 400.00 ± 4% interrupts.CPU148.RES:Rescheduling_interrupts
461.75 ± 5% -14.5% 395.00 ± 11% interrupts.CPU149.RES:Rescheduling_interrupts
448.75 ± 4% -12.5% 392.50 ± 10% interrupts.CPU150.RES:Rescheduling_interrupts
497.50 ± 9% -16.2% 417.00 ± 7% interrupts.CPU151.RES:Rescheduling_interrupts
476.25 ± 5% -14.8% 406.00 ± 12% interrupts.CPU153.RES:Rescheduling_interrupts
502.00 ± 5% -18.8% 407.75 ± 5% interrupts.CPU154.RES:Rescheduling_interrupts
486.75 ± 3% -14.5% 416.25 ± 8% interrupts.CPU156.RES:Rescheduling_interrupts
468.75 ± 6% -14.5% 400.75 ± 8% interrupts.CPU161.RES:Rescheduling_interrupts
465.25 ± 7% -13.1% 404.50 ± 5% interrupts.CPU165.RES:Rescheduling_interrupts
477.75 ± 8% -13.6% 413.00 ± 6% interrupts.CPU166.RES:Rescheduling_interrupts
593.00 ± 16% -25.8% 440.25 ± 8% interrupts.CPU167.RES:Rescheduling_interrupts
569.25 ± 10% -14.1% 489.25 ± 4% interrupts.CPU168.RES:Rescheduling_interrupts
496.75 ± 4% -14.5% 424.50 ± 9% interrupts.CPU175.RES:Rescheduling_interrupts
489.50 ± 10% -20.4% 389.75 ± 4% interrupts.CPU176.RES:Rescheduling_interrupts
499.50 ± 10% -24.6% 376.75 ± 13% interrupts.CPU177.RES:Rescheduling_interrupts
509.75 ± 11% -19.4% 410.75 ± 4% interrupts.CPU179.RES:Rescheduling_interrupts
476.50 ± 8% -17.2% 394.50 ± 11% interrupts.CPU180.RES:Rescheduling_interrupts
489.75 ± 10% -21.7% 383.25 ± 11% interrupts.CPU182.RES:Rescheduling_interrupts
502.00 ± 11% -16.7% 418.25 ± 10% interrupts.CPU191.RES:Rescheduling_interrupts
464.00 -12.4% 406.25 ± 6% interrupts.CPU28.RES:Rescheduling_interrupts
461.50 ± 5% -11.3% 409.50 ± 5% interrupts.CPU45.RES:Rescheduling_interrupts
564.00 ± 3% -15.7% 475.50 ± 8% interrupts.CPU48.RES:Rescheduling_interrupts
491.00 ± 7% -14.6% 419.50 ± 4% interrupts.CPU49.RES:Rescheduling_interrupts
465.75 ± 6% -14.8% 396.75 ± 8% interrupts.CPU53.RES:Rescheduling_interrupts
498.75 ± 7% -21.0% 394.00 ± 6% interrupts.CPU54.RES:Rescheduling_interrupts
474.00 ± 3% -12.7% 414.00 ± 8% interrupts.CPU57.RES:Rescheduling_interrupts
478.00 ± 7% -19.9% 382.75 ± 5% interrupts.CPU59.RES:Rescheduling_interrupts
463.50 ± 5% -16.1% 388.75 ± 10% interrupts.CPU61.RES:Rescheduling_interrupts
468.00 ± 9% -20.2% 373.25 ± 9% interrupts.CPU62.RES:Rescheduling_interrupts
465.75 ± 4% -15.5% 393.50 ± 9% interrupts.CPU64.RES:Rescheduling_interrupts
3346 ± 33% +44.2% 4826 interrupts.CPU65.NMI:Non-maskable_interrupts
3346 ± 33% +44.2% 4826 interrupts.CPU65.PMI:Performance_monitoring_interrupts
460.25 ± 5% -13.5% 398.00 ± 3% interrupts.CPU65.RES:Rescheduling_interrupts
3339 ± 33% +45.0% 4843 interrupts.CPU66.NMI:Non-maskable_interrupts
3339 ± 33% +45.0% 4843 interrupts.CPU66.PMI:Performance_monitoring_interrupts
592.50 ± 31% -30.4% 412.50 ± 6% interrupts.CPU81.RES:Rescheduling_interrupts
483.50 ± 9% -16.3% 404.50 ± 9% interrupts.CPU94.RES:Rescheduling_interrupts
92682 ± 6% -7.0% 86177 ± 5% interrupts.RES:Rescheduling_interrupts
82350 -100.0% 0.00 sched_debug.cfs_rq:/.exec_clock.avg
83827 -100.0% 0.00 sched_debug.cfs_rq:/.exec_clock.max
79902 -100.0% 0.00 sched_debug.cfs_rq:/.exec_clock.min
595.00 ± 9% -100.0% 0.00 sched_debug.cfs_rq:/.exec_clock.stddev
2512 ± 13% -38.9% 1535 ± 26% sched_debug.cfs_rq:/.load.avg
233913 ± 32% -74.7% 59149 ±126% sched_debug.cfs_rq:/.load.max
18849 ± 28% -65.2% 6566 ± 79% sched_debug.cfs_rq:/.load.stddev
9703608 ± 3% +14.9% 11145876 sched_debug.cfs_rq:/.min_vruntime.avg
9803403 ± 3% +14.5% 11221751 sched_debug.cfs_rq:/.min_vruntime.max
9410664 ± 3% +15.4% 10861526 sched_debug.cfs_rq:/.min_vruntime.min
67038 ± 5% -24.0% 50961 ± 10% sched_debug.cfs_rq:/.min_vruntime.stddev
0.49 ± 12% -100.0% 0.00 sched_debug.cfs_rq:/.nr_spread_over.avg
8.71 ± 19% -100.0% 0.00 sched_debug.cfs_rq:/.nr_spread_over.max
1.07 ± 9% -100.0% 0.00 sched_debug.cfs_rq:/.nr_spread_over.stddev
1.83 ± 18% -47.5% 0.96 ± 38% sched_debug.cfs_rq:/.runnable_load_avg.avg
224.29 ± 32% -76.4% 53.04 ±136% sched_debug.cfs_rq:/.runnable_load_avg.max
17.14 ± 30% -70.5% 5.06 ±102% sched_debug.cfs_rq:/.runnable_load_avg.stddev
2483 ± 12% -39.1% 1512 ± 26% sched_debug.cfs_rq:/.runnable_weight.avg
231331 ± 32% -75.3% 57065 ±131% sched_debug.cfs_rq:/.runnable_weight.max
18727 ± 28% -65.5% 6464 ± 80% sched_debug.cfs_rq:/.runnable_weight.stddev
67046 ± 5% -24.1% 50879 ± 11% sched_debug.cfs_rq:/.spread0.stddev
99583 ± 6% +18.9% 118389 ± 3% sched_debug.cpu.avg_idle.stddev
1.92 ± 18% -47.1% 1.02 ± 34% sched_debug.cpu.cpu_load[0].avg
224.62 ± 33% -76.0% 53.88 ±133% sched_debug.cpu.cpu_load[0].max
17.17 ± 30% -70.5% 5.07 ±100% sched_debug.cpu.cpu_load[0].stddev
2.10 ± 16% -43.1% 1.20 ± 27% sched_debug.cpu.cpu_load[1].avg
224.00 ± 32% -76.1% 53.54 ±134% sched_debug.cpu.cpu_load[1].max
17.04 ± 30% -71.0% 4.94 ±103% sched_debug.cpu.cpu_load[1].stddev
2.29 ± 14% -41.0% 1.35 ± 25% sched_debug.cpu.cpu_load[2].avg
223.12 ± 32% -75.9% 53.75 ±134% sched_debug.cpu.cpu_load[2].max
17.01 ± 30% -71.0% 4.94 ±103% sched_debug.cpu.cpu_load[2].stddev
2.39 ± 12% -40.4% 1.43 ± 24% sched_debug.cpu.cpu_load[3].avg
223.88 ± 31% -75.0% 55.92 ±128% sched_debug.cpu.cpu_load[3].max
17.05 ± 29% -70.1% 5.11 ± 98% sched_debug.cpu.cpu_load[3].stddev
2.42 ± 11% -37.3% 1.52 ± 24% sched_debug.cpu.cpu_load[4].avg
231.38 ± 28% -66.6% 77.29 ± 91% sched_debug.cpu.cpu_load[4].max
17.50 ± 27% -63.2% 6.45 ± 77% sched_debug.cpu.cpu_load[4].stddev
2464 ± 13% -40.5% 1467 ± 25% sched_debug.cpu.load.avg
233628 ± 32% -74.8% 58842 ±127% sched_debug.cpu.load.max
18747 ± 28% -65.7% 6421 ± 80% sched_debug.cpu.load.stddev
1862 ± 7% +10.3% 2055 ± 5% sched_debug.cpu.nr_switches.stddev
25907 -100.0% 0.00 sched_debug.cpu.sched_count.avg
35456 ± 2% -100.0% 0.00 sched_debug.cpu.sched_count.max
24844 -100.0% 0.00 sched_debug.cpu.sched_count.min
1410 ± 10% -100.0% 0.00 sched_debug.cpu.sched_count.stddev
12738 -100.0% 0.00 sched_debug.cpu.sched_goidle.avg
17314 -100.0% 0.00 sched_debug.cpu.sched_goidle.max
12190 -100.0% 0.00 sched_debug.cpu.sched_goidle.min
670.77 ± 6% -100.0% 0.00 sched_debug.cpu.sched_goidle.stddev
12930 -100.0% 0.00 sched_debug.cpu.ttwu_count.avg
19266 ± 4% -100.0% 0.00 sched_debug.cpu.ttwu_count.max
7000 ± 2% -100.0% 0.00 sched_debug.cpu.ttwu_count.min
2676 ± 7% -100.0% 0.00 sched_debug.cpu.ttwu_count.stddev
324.30 -100.0% 0.00 sched_debug.cpu.ttwu_local.avg
1152 ± 8% -100.0% 0.00 sched_debug.cpu.ttwu_local.max
220.62 -100.0% 0.00 sched_debug.cpu.ttwu_local.min
97.33 ± 7% -100.0% 0.00 sched_debug.cpu.ttwu_local.stddev
0.01 -75.0% 0.00 ±173% sched_debug.rt_rq:/.rt_nr_migratory.stddev
0.01 -75.0% 0.00 ±173% sched_debug.rt_rq:/.rt_nr_running.stddev
94.86 -3.3 91.51 perf-profile.calltrace.cycles-pp.page_fault.testcase
93.91 -3.0 90.86 perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.testcase
94.06 -3.0 91.02 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.testcase
53.17 -2.6 50.59 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
4.69 ± 4% -1.3 3.34 ± 3% perf-profile.calltrace.cycles-pp.put_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
9.96 ± 2% -1.0 8.99 perf-profile.calltrace.cycles-pp.up_read.__do_page_fault.do_page_fault.page_fault.testcase
8.49 -0.9 7.57 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
4.38 ± 3% -0.7 3.71 ± 2% perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
2.99 ± 4% -0.6 2.38 ± 2% perf-profile.calltrace.cycles-pp.secondary_startup_64
2.98 ± 4% -0.6 2.36 ± 2% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
2.98 ± 4% -0.6 2.37 ± 2% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
2.98 ± 4% -0.6 2.37 ± 2% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
2.53 ± 5% -0.5 2.02 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
3.22 ± 4% -0.5 2.75 ± 3% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault
2.04 ± 6% -0.4 1.60 ± 3% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
2.08 ± 6% -0.4 1.64 ± 4% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault
1.96 ± 6% -0.4 1.54 ± 4% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte
1.45 -0.3 1.16 perf-profile.calltrace.cycles-pp.unlock_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.67 ± 3% -0.1 0.57 perf-profile.calltrace.cycles-pp.lru_cache_add_active_or_unevictable.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
0.83 -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
0.83 -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
0.83 -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.munmap
0.83 -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
0.83 -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
0.82 -0.1 0.75 ± 3% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.78 -0.1 0.73 ± 2% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
0.74 -0.1 0.69 ± 2% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
0.75 -0.1 0.70 ± 3% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
0.87 -0.0 0.82 perf-profile.calltrace.cycles-pp.__pagevec_lru_add_fn.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault
4.00 +0.2 4.17 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault
4.03 +0.2 4.21 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault
95.80 +0.5 96.28 perf-profile.calltrace.cycles-pp.testcase
14.11 +1.7 15.76 perf-profile.calltrace.cycles-pp.copy_page.copy_user_highpage.__handle_mm_fault.handle_mm_fault.__do_page_fault
14.42 +1.7 16.16 perf-profile.calltrace.cycles-pp.copy_user_highpage.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.25 ±173% +4.0 4.24 ± 8% perf-profile.calltrace.cycles-pp.handle_mm_fault.testcase
94.62 -3.1 91.53 perf-profile.children.cycles-pp.page_fault
93.93 -3.0 90.88 perf-profile.children.cycles-pp.__do_page_fault
94.08 -3.0 91.03 perf-profile.children.cycles-pp.do_page_fault
53.30 -2.6 50.69 perf-profile.children.cycles-pp.__handle_mm_fault
4.71 ± 4% -1.3 3.37 ± 3% perf-profile.children.cycles-pp.put_page
10.07 ± 2% -1.0 9.11 perf-profile.children.cycles-pp.up_read
8.49 -0.9 7.58 perf-profile.children.cycles-pp.finish_fault
4.40 ± 3% -0.7 3.71 ± 2% perf-profile.children.cycles-pp.__lru_cache_add
2.99 ± 4% -0.6 2.38 ± 2% perf-profile.children.cycles-pp.secondary_startup_64
2.99 ± 4% -0.6 2.38 ± 2% perf-profile.children.cycles-pp.cpu_startup_entry
2.99 ± 4% -0.6 2.38 ± 2% perf-profile.children.cycles-pp.do_idle
2.98 ± 4% -0.6 2.37 ± 2% perf-profile.children.cycles-pp.start_secondary
2.58 ± 4% -0.5 2.07 ± 3% perf-profile.children.cycles-pp.cpuidle_enter_state
3.23 ± 4% -0.5 2.75 ± 3% perf-profile.children.cycles-pp.pagevec_lru_move_fn
2.17 ± 6% -0.5 1.71 ± 4% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
2.05 ± 6% -0.4 1.60 ± 3% perf-profile.children.cycles-pp.intel_idle
1.46 -0.3 1.18 perf-profile.children.cycles-pp.unlock_page
1.73 ± 3% -0.3 1.47 ± 6% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
1.46 ± 3% -0.2 1.23 ± 8% perf-profile.children.cycles-pp.hrtimer_interrupt
1.12 ± 2% -0.2 0.91 ± 7% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.90 -0.2 0.69 ± 9% perf-profile.children.cycles-pp.tick_sched_timer
0.85 ± 2% -0.2 0.64 ± 9% perf-profile.children.cycles-pp.update_process_times
0.86 ± 2% -0.2 0.65 ± 9% perf-profile.children.cycles-pp.tick_sched_handle
0.73 ± 4% -0.2 0.53 ± 7% perf-profile.children.cycles-pp.scheduler_tick
0.67 ± 5% -0.2 0.48 ± 7% perf-profile.children.cycles-pp.task_tick_fair
0.57 ± 7% -0.1 0.45 perf-profile.children.cycles-pp.native_irq_return_iret
0.68 ± 2% -0.1 0.58 ± 2% perf-profile.children.cycles-pp.lru_cache_add_active_or_unevictable
0.28 ± 8% -0.1 0.20 ± 4% perf-profile.children.cycles-pp.update_cfs_group
0.92 -0.1 0.85 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.92 -0.1 0.85 ± 3% perf-profile.children.cycles-pp.do_syscall_64
0.82 -0.1 0.75 ± 3% perf-profile.children.cycles-pp.__do_munmap
0.83 -0.1 0.76 ± 2% perf-profile.children.cycles-pp.munmap
0.83 -0.1 0.76 ± 2% perf-profile.children.cycles-pp.__vm_munmap
0.83 -0.1 0.76 ± 2% perf-profile.children.cycles-pp.__x64_sys_munmap
0.78 -0.1 0.73 ± 2% perf-profile.children.cycles-pp.unmap_region
0.75 -0.0 0.70 ± 3% perf-profile.children.cycles-pp.unmap_vmas
0.75 -0.0 0.70 ± 3% perf-profile.children.cycles-pp.unmap_page_range
0.26 ± 6% -0.0 0.22 ± 8% perf-profile.children.cycles-pp.___might_sleep
0.93 -0.0 0.89 perf-profile.children.cycles-pp.__pagevec_lru_add_fn
0.18 ± 6% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.menu_select
0.23 ± 5% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.__sched_text_start
0.23 ± 3% -0.0 0.20 ± 4% perf-profile.children.cycles-pp.irq_exit
0.38 ± 3% -0.0 0.35 perf-profile.children.cycles-pp.tlb_flush_mmu_free
0.18 ± 2% -0.0 0.15 ± 5% perf-profile.children.cycles-pp.__softirqentry_text_start
0.11 ± 6% -0.0 0.09 ± 7% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.25 ± 3% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.free_unref_page_list
0.10 ± 5% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.schedule_idle
0.17 ± 2% +0.0 0.21 ± 6% perf-profile.children.cycles-pp.page_add_new_anon_rmap
0.21 ± 5% +0.0 0.26 ± 9% perf-profile.children.cycles-pp.page_mapping
0.00 +0.1 0.05 ± 9% perf-profile.children.cycles-pp._cond_resched
4.01 +0.2 4.18 perf-profile.children.cycles-pp.find_lock_entry
4.04 +0.2 4.22 perf-profile.children.cycles-pp.shmem_getpage_gfp
0.03 ±173% +0.3 0.37 perf-profile.children.cycles-pp.mem_cgroup_from_task
96.04 +0.7 96.73 perf-profile.children.cycles-pp.testcase
14.42 +1.8 16.17 perf-profile.children.cycles-pp.copy_user_highpage
14.36 +1.8 16.11 perf-profile.children.cycles-pp.copy_page
54.36 +3.6 57.95 perf-profile.children.cycles-pp.handle_mm_fault
11.18 -1.7 9.46 perf-profile.self.cycles-pp.__handle_mm_fault
4.64 ± 4% -1.3 3.32 ± 3% perf-profile.self.cycles-pp.put_page
9.93 ± 2% -0.9 9.00 perf-profile.self.cycles-pp.up_read
2.04 ± 6% -0.4 1.60 ± 3% perf-profile.self.cycles-pp.intel_idle
1.44 -0.3 1.17 perf-profile.self.cycles-pp.unlock_page
1.16 -0.2 0.95 perf-profile.self.cycles-pp.__lru_cache_add
0.57 ± 7% -0.1 0.45 perf-profile.self.cycles-pp.native_irq_return_iret
0.67 ± 2% -0.1 0.57 ± 2% perf-profile.self.cycles-pp.lru_cache_add_active_or_unevictable
0.28 ± 8% -0.1 0.20 ± 4% perf-profile.self.cycles-pp.update_cfs_group
0.29 ± 3% -0.1 0.20 ± 9% perf-profile.self.cycles-pp.task_tick_fair
0.82 ± 2% -0.1 0.76 perf-profile.self.cycles-pp.__pagevec_lru_add_fn
0.60 ± 4% -0.1 0.54 ± 3% perf-profile.self.cycles-pp.testcase
0.26 ± 5% -0.0 0.22 ± 6% perf-profile.self.cycles-pp.___might_sleep
0.18 ± 2% -0.0 0.15 ± 4% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.08 ± 15% -0.0 0.06 perf-profile.self.cycles-pp.switch_mm_irqs_off
0.12 ± 4% -0.0 0.11 ± 3% perf-profile.self.cycles-pp.free_pcppages_bulk
0.18 ± 2% +0.0 0.22 ± 5% perf-profile.self.cycles-pp.__mod_node_page_state
0.60 +0.1 0.71 perf-profile.self.cycles-pp.get_page_from_freelist
0.86 ± 3% +0.1 0.98 ± 7% perf-profile.self.cycles-pp.find_lock_entry
0.00 +0.1 0.13 ± 6% perf-profile.self.cycles-pp.mem_cgroup_from_task
14.13 +1.8 15.90 perf-profile.self.cycles-pp.copy_page
0.99 ± 52% +6.0 7.01 ± 4% perf-profile.self.cycles-pp.handle_mm_fault
will-it-scale.per_thread_ops

  [ASCII trend chart, y-axis 9500-12000 per_thread_ops: bisect-good
   samples run around 11000-11500, while bisect-bad samples cluster
   near 10000, matching the -8.6% change reported above.]

 [*] bisect-good sample
 [O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.1.0-rc4-00070-g364f784f0" of type "text/plain" (188810 bytes)
View attachment "job-script" of type "text/plain" (7389 bytes)
View attachment "job.yaml" of type "text/plain" (5077 bytes)
View attachment "reproduce" of type "text/plain" (315 bytes)