Message-ID: <1441161631.3138.7.camel@intel.com>
Date: Wed, 02 Sep 2015 10:40:31 +0800
From: Huang Ying <ying.huang@...el.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: Andy Lutomirski <luto@...nel.org>, lkp@...org,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>,
Denys Vlasenko <dvlasenk@...hat.com>
Subject: Re: [lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops
On Wed, 2015-08-05 at 10:38 +0200, Ingo Molnar wrote:
> * kernel test robot <ying.huang@...el.com> wrote:
>
> > FYI, we noticed the below changes on
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
> > commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection of GCC -mpreferred-stack-boundary support")
>
> Does the performance regression go away reproducibly if you do:
>
> git revert b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
>
> ?
Sorry for replying so late!
Reverting the commit restores part of the performance, as shown below.
parent commit: f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
the commit: b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
revert commit: 987d12601a4a82cc2f2151b1be704723eb84cb9d
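
For reference, the three kernels above were compared roughly as follows (a
simplified sketch; the lkp robot automates the build/boot/run steps, and only
the git commands are literal):

    git checkout f2a50f8b7da4            # parent commit
    make -j"$(nproc)" bzImage            # build, boot, run will-it-scale readseek2
    git checkout b2c51106c758            # the commit under test
    make -j"$(nproc)" bzImage            # rebuild, reboot, rerun the test
    git revert --no-edit b2c51106c758    # revert commit (987d12601a4a in our tree)
    make -j"$(nproc)" bzImage            # rebuild, reboot, rerun the test
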
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/test:
wsm/will-it-scale/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/readseek2
commit:
f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
987d12601a4a82cc2f2151b1be704723eb84cb9d
f2a50f8b7da45ff2 b2c51106c7581866c37ffc77c5 987d12601a4a82cc2f2151b1be
 (parent commit)       (the commit)              (revert commit)
---------------- -------------------------- --------------------------
         %stddev      %change      %stddev      %change      %stddev
             \            |            \            |            \
879002 ± 0% -18.1% 720270 ± 7% -3.6% 847011 ± 2% will-it-scale.per_process_ops
0.02 ± 0% +34.5% 0.02 ± 7% +5.6% 0.02 ± 2% will-it-scale.scalability
11144 ± 0% +0.1% 11156 ± 0% +10.6% 12320 ± 0% will-it-scale.time.minor_page_faults
769.30 ± 0% -0.9% 762.15 ± 0% +1.1% 777.42 ± 0% will-it-scale.time.system_time
26153173 ± 0% +7.0% 27977076 ± 0% +3.5% 27078124 ± 0% will-it-scale.time.voluntary_context_switches
2964 ± 2% +1.4% 3004 ± 1% -51.9% 1426 ± 2% proc-vmstat.pgactivate
0.06 ± 27% +154.5% 0.14 ± 44% +122.7% 0.12 ± 24% turbostat.CPU%c3
370683 ± 0% +6.2% 393491 ± 0% +2.4% 379575 ± 0% vmstat.system.cs
11144 ± 0% +0.1% 11156 ± 0% +10.6% 12320 ± 0% time.minor_page_faults
15.70 ± 2% +14.5% 17.98 ± 0% +1.5% 15.94 ± 1% time.user_time
830343 ± 56% -54.0% 382128 ± 39% -22.3% 645308 ± 65% cpuidle.C1E-NHM.time
788.25 ± 14% -21.7% 617.25 ± 16% -12.3% 691.00 ± 3% cpuidle.C1E-NHM.usage
2489132 ± 20% +79.3% 4464147 ± 33% +78.4% 4440574 ± 21% cpuidle.C3-NHM.time
1082762 ±162% -100.0% 0.00 ± -1% +189.3% 3132030 ±110% latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
102189 ± 2% -2.1% 100087 ± 5% -32.9% 68568 ± 2% latency_stats.hits.pipe_wait.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
1082762 ±162% -100.0% 0.00 ± -1% +289.6% 4217977 ±109% latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
1082762 ±162% -100.0% 0.00 ± -1% +478.5% 6264061 ±110% latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
5.10 ± 2% -8.0% 4.69 ± 1% +13.0% 5.76 ± 1% perf-profile.cpu-cycles.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
2.58 ± 8% +19.5% 3.09 ± 3% -1.8% 2.54 ± 11% perf-profile.cpu-cycles._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry
7.02 ± 3% +9.2% 7.67 ± 2% +7.1% 7.52 ± 3% perf-profile.cpu-cycles._raw_spin_lock_irqsave.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry
3.07 ± 2% +14.8% 3.53 ± 3% -1.4% 3.03 ± 5% perf-profile.cpu-cycles.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
3.05 ± 5% -8.4% 2.79 ± 4% -5.2% 2.90 ± 5% perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_idle_enter.cpu_startup_entry
0.89 ± 5% -7.6% 0.82 ± 3% +16.3% 1.03 ± 5% perf-profile.cpu-cycles.is_ftrace_trampoline.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
0.98 ± 3% -25.1% 0.74 ± 7% -16.8% 0.82 ± 2% perf-profile.cpu-cycles.is_ftrace_trampoline.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
1.58 ± 3% -5.2% 1.50 ± 2% +44.2% 2.28 ± 1% perf-profile.cpu-cycles.is_module_text_address.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
1.82 ± 18% +46.6% 2.67 ± 3% -32.6% 1.23 ± 56% perf-profile.cpu-cycles.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page
8.05 ± 3% +9.5% 8.82 ± 3% +5.4% 8.49 ± 2% perf-profile.cpu-cycles.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
1.16 ± 2% +6.9% 1.25 ± 5% +11.4% 1.30 ± 5% perf-profile.cpu-cycles.put_page.shmem_file_read_iter.__vfs_read.vfs_read.sys_read
11102 ± 1% +0.0% 11102 ± 1% -95.8% 468.00 ± 0% slabinfo.Acpi-ParseExt.active_objs
198.25 ± 1% +0.0% 198.25 ± 1% -93.9% 12.00 ± 0% slabinfo.Acpi-ParseExt.active_slabs
11102 ± 1% +0.0% 11102 ± 1% -95.8% 468.00 ± 0% slabinfo.Acpi-ParseExt.num_objs
198.25 ± 1% +0.0% 198.25 ± 1% -93.9% 12.00 ± 0% slabinfo.Acpi-ParseExt.num_slabs
341.25 ± 14% +2.9% 351.00 ± 11% -100.0% 0.00 ± -1% slabinfo.blkdev_ioc.active_objs
341.25 ± 14% +2.9% 351.00 ± 11% -100.0% 0.00 ± -1% slabinfo.blkdev_ioc.num_objs
438.00 ± 16% -8.3% 401.50 ± 20% -100.0% 0.00 ± -1% slabinfo.file_lock_ctx.active_objs
438.00 ± 16% -8.3% 401.50 ± 20% -100.0% 0.00 ± -1% slabinfo.file_lock_ctx.num_objs
4398 ± 1% +1.4% 4462 ± 0% -14.5% 3761 ± 2% slabinfo.ftrace_event_field.active_objs
4398 ± 1% +1.4% 4462 ± 0% -14.5% 3761 ± 2% slabinfo.ftrace_event_field.num_objs
3947 ± 2% +10.6% 4363 ± 3% +107.1% 8175 ± 2% slabinfo.kmalloc-192.active_objs
93.00 ± 2% +10.8% 103.00 ± 3% +120.2% 204.75 ± 2% slabinfo.kmalloc-192.active_slabs
3947 ± 2% +10.6% 4363 ± 3% +118.4% 8620 ± 2% slabinfo.kmalloc-192.num_objs
93.00 ± 2% +10.8% 103.00 ± 3% +120.2% 204.75 ± 2% slabinfo.kmalloc-192.num_slabs
1794 ± 0% +3.2% 1851 ± 2% +12.2% 2012 ± 3% slabinfo.trace_event_file.active_objs
1794 ± 0% +3.2% 1851 ± 2% +12.2% 2012 ± 3% slabinfo.trace_event_file.num_objs
7065 ± 7% -5.4% 6684 ± 8% -100.0% 0.00 ± -1% slabinfo.vm_area_struct.active_objs
160.50 ± 7% -5.5% 151.75 ± 8% -100.0% 0.00 ± -1% slabinfo.vm_area_struct.active_slabs
7091 ± 7% -5.6% 6694 ± 8% -100.0% 0.00 ± -1% slabinfo.vm_area_struct.num_objs
160.50 ± 7% -5.5% 151.75 ± 8% -100.0% 0.00 ± -1% slabinfo.vm_area_struct.num_slabs
857.50 ± 29% +75.7% 1506 ± 78% +157.6% 2209 ± 33% sched_debug.cfs_rq[11]:/.blocked_load_avg
52.75 ± 29% -29.4% 37.25 ± 60% +103.3% 107.25 ± 43% sched_debug.cfs_rq[11]:/.load
914.50 ± 29% +69.9% 1553 ± 77% +155.6% 2337 ± 32% sched_debug.cfs_rq[11]:/.tg_load_contrib
7.75 ± 34% -64.5% 2.75 ± 64% -12.9% 6.75 ±115% sched_debug.cfs_rq[2]:/.nr_spread_over
1135 ± 20% -43.6% 640.75 ± 49% -18.8% 922.50 ± 51% sched_debug.cfs_rq[3]:/.blocked_load_avg
1215 ± 21% -43.1% 691.50 ± 46% -21.3% 956.25 ± 50% sched_debug.cfs_rq[3]:/.tg_load_contrib
38.50 ± 21% +129.9% 88.50 ± 36% +96.1% 75.50 ± 56% sched_debug.cfs_rq[4]:/.load
26.00 ± 20% +98.1% 51.50 ± 46% +142.3% 63.00 ± 53% sched_debug.cfs_rq[4]:/.runnable_load_avg
128.25 ± 18% +227.5% 420.00 ± 43% +152.4% 323.75 ± 68% sched_debug.cfs_rq[4]:/.utilization_load_avg
28320 ± 12% -6.3% 26545 ± 11% -19.4% 22813 ± 13% sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
1015 ± 78% +101.1% 2042 ± 25% +64.4% 1669 ± 73% sched_debug.cfs_rq[6]:/.blocked_load_avg
1069 ± 72% +100.2% 2140 ± 23% +61.2% 1722 ± 70% sched_debug.cfs_rq[6]:/.tg_load_contrib
619.25 ± 12% -6.3% 580.25 ± 11% -19.2% 500.25 ± 13% sched_debug.cfs_rq[6]:/.tg_runnable_contrib
88.75 ± 14% -47.3% 46.75 ± 36% -24.5% 67.00 ± 11% sched_debug.cfs_rq[9]:/.load
59.25 ± 23% -41.4% 34.75 ± 34% -6.3% 55.50 ± 12% sched_debug.cfs_rq[9]:/.runnable_load_avg
315.50 ± 45% -64.6% 111.67 ± 1% -12.1% 277.25 ± 3% sched_debug.cfs_rq[9]:/.utilization_load_avg
2246758 ± 7% +87.6% 4213925 ± 65% -2.2% 2197475 ± 4% sched_debug.cpu#0.nr_switches
2249376 ± 7% +87.4% 4215969 ± 65% -2.2% 2199216 ± 4% sched_debug.cpu#0.sched_count
1121438 ± 7% +81.0% 2030313 ± 61% -2.2% 1096479 ± 4% sched_debug.cpu#0.sched_goidle
1151160 ± 7% +86.5% 2146608 ± 64% -1.9% 1129264 ± 3% sched_debug.cpu#0.ttwu_count
33.75 ± 15% -22.2% 26.25 ± 6% -8.9% 30.75 ± 10% sched_debug.cpu#1.cpu_load[3]
33.25 ± 10% -18.0% 27.25 ± 7% -3.8% 32.00 ± 11% sched_debug.cpu#1.cpu_load[4]
41.75 ± 29% +23.4% 51.50 ± 33% +53.9% 64.25 ± 16% sched_debug.cpu#10.cpu_load[1]
40.00 ± 18% +24.4% 49.75 ± 18% +49.4% 59.75 ± 8% sched_debug.cpu#10.cpu_load[2]
39.25 ± 14% +22.3% 48.00 ± 10% +38.9% 54.50 ± 7% sched_debug.cpu#10.cpu_load[3]
39.50 ± 15% +20.3% 47.50 ± 6% +30.4% 51.50 ± 7% sched_debug.cpu#10.cpu_load[4]
5269004 ± 1% +27.8% 6731790 ± 30% +1.4% 5342560 ± 2% sched_debug.cpu#10.nr_switches
5273193 ± 1% +27.8% 6736526 ± 30% +1.4% 5345791 ± 2% sched_debug.cpu#10.sched_count
2633974 ± 1% +27.8% 3365271 ± 30% +1.4% 2670901 ± 2% sched_debug.cpu#10.sched_goidle
2644149 ± 1% +26.9% 3356318 ± 30% +1.9% 2693295 ± 1% sched_debug.cpu#10.ttwu_count
26.50 ± 37% +116.0% 57.25 ± 48% +109.4% 55.50 ± 29% sched_debug.cpu#11.cpu_load[0]
30.75 ± 15% +66.7% 51.25 ± 31% +65.9% 51.00 ± 21% sched_debug.cpu#11.cpu_load[1]
33.50 ± 10% +37.3% 46.00 ± 22% +39.6% 46.75 ± 17% sched_debug.cpu#11.cpu_load[2]
37.00 ± 11% +15.5% 42.75 ± 19% +29.7% 48.00 ± 11% sched_debug.cpu#11.cpu_load[4]
508300 ± 11% -0.6% 505024 ± 1% +18.1% 600291 ± 7% sched_debug.cpu#4.avg_idle
454696 ± 9% -5.9% 427894 ± 25% +21.8% 553608 ± 4% sched_debug.cpu#5.avg_idle
66.00 ± 27% +11.0% 73.25 ± 37% -46.6% 35.25 ± 22% sched_debug.cpu#6.cpu_load[0]
62.00 ± 36% +12.5% 69.75 ± 45% -41.5% 36.25 ± 11% sched_debug.cpu#6.cpu_load[1]
247681 ± 19% +21.0% 299747 ± 10% +28.7% 318764 ± 17% sched_debug.cpu#8.avg_idle
5116609 ± 4% +34.5% 6884238 ± 33% +55.2% 7942254 ± 34% sched_debug.cpu#9.nr_switches
5120531 ± 4% +34.5% 6889156 ± 33% +55.2% 7945270 ± 34% sched_debug.cpu#9.sched_count
2557822 ± 4% +34.5% 3441428 ± 33% +55.2% 3970337 ± 34% sched_debug.cpu#9.sched_goidle
2565307 ± 4% +32.9% 3410042 ± 33% +54.0% 3949696 ± 34% sched_debug.cpu#9.ttwu_count
0.00 ±141% +4.2e+05% 4.76 ±173% +47.7% 0.00 ±-59671% sched_debug.rt_rq[10]:/.rt_time
155259 ± 0% +0.0% 155259 ± 0% -42.2% 89723 ± 0% sched_debug.sysctl_sched.sysctl_sched_features
Best Regards,
Huang, Ying