Message-ID: <20160607044817.GC16057@yexl-desktop>
Date: Tue, 7 Jun 2016 12:48:17 +0800
From: Ye Xiaolong <xiaolong.ye@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Rik van Riel <riel@...hat.com>, lkp@...org,
LKML <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...e.de>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [LKP] [lkp] [mm] 795ae7a0de: pixz.throughput -9.1% regression
On Mon, Jun 06, 2016 at 04:53:07PM +0800, Ye Xiaolong wrote:
>On Fri, Jun 03, 2016 at 06:21:09PM -0400, Johannes Weiner wrote:
>>On Thu, Jun 02, 2016 at 12:07:06PM -0400, Johannes Weiner wrote:
>>> Hi,
>>>
>>> On Thu, Jun 02, 2016 at 02:45:07PM +0800, kernel test robot wrote:
>>> > FYI, we noticed pixz.throughput -9.1% regression due to commit:
>>> >
>>> > commit 795ae7a0de6b834a0cc202aa55c190ef81496665 ("mm: scale kswapd watermarks in proportion to memory")
>>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>> >
>>> > in testcase: pixz
>>> > on test machine: ivb43: 48 threads Ivytown Ivy Bridge-EP with 64G memory with following parameters: cpufreq_governor=performance/nr_threads=100%
>>>
>>> Xiaolong, thanks for the report.
>>>
>>> It looks like the regression stems from a change in NUMA placement:
>>
>>Scratch that, I was misreading the test results. It's just fewer
>>allocations in general that happen during the fixed testing time.
>>
>>I'm stumped by this report. All this patch does other than affect page
>>reclaim (which is not involved here) is increase the size of the round
>>robin batches in the fair zone allocator. That should *reduce* work in
>>the page allocator, if anything.
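>>For anyone following along, the watermark scaling that 795ae7a describes
>>can be sketched roughly like this. This is a simplified, hypothetical
>>model in Python, not the kernel's code; the function name, the
>>page-count arguments, and the default scale factor of 10 (i.e. 0.1% of
>>zone size) are illustrative assumptions:

```python
def zone_watermarks(managed_pages, min_wmark_pages, watermark_scale_factor=10):
    """Sketch of per-zone (min, low, high) watermark computation.

    Previously the low/high gaps were fixed fractions of the min
    watermark (min/4 and min/2). The change additionally scales the
    gap with zone size, so kswapd wakes up earlier on large machines:
        gap = max(min/4, managed_pages * factor / 10000)
    """
    gap = max(min_wmark_pages // 4,
              managed_pages * watermark_scale_factor // 10000)
    low = min_wmark_pages + gap
    high = min_wmark_pages + 2 * gap
    return min_wmark_pages, low, high

# On a large zone the size-proportional term dominates the gap:
print(zone_watermarks(1_000_000, 1000))   # (1000, 2000, 3000)
# On a small zone the old min/4 term still wins:
print(zone_watermarks(10_000, 1000))      # (1000, 1250, 1500)
```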
>>
>>But I also keep failing to reproduce this - having tried it on the
>>third machine now - neither pixz nor will-it-scale/page_fault1 give me
>>that -8-9% regression:
>>
>>4.5.0-02574-g3ed3a4f:
>>PIXZ-good.log:throughput: 39908733.604941994
>>PIXZ-good.log:throughput: 37067947.25049969
>>PIXZ-good.log:throughput: 38604938.39131216
>>
>>4.5.0-02575-g795ae7a:
>> PIXZ-bad.log:throughput: 39489120.87179377
>> PIXZ-bad.log:throughput: 39307299.288432725
>> PIXZ-bad.log:throughput: 38795994.3329781
>>
>>Is this reliably reproducible with 3ed3a4f vs 795ae7a?
>
>Yes, it reproduces reliably in the LKP environment; I've run 3ed3a4f0ddffece9
>4 times and 795ae7a0de6b834a0cc202aa55 7 times.
>
>3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55
>---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> :4 50% 2:7 kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
> %stddev %change %stddev
> \ | \
> 78505362 ± 0% -9.2% 71298182 ± 0% pixz.throughput
>
>
>>
>>Could I ask you to retry the test with Linus's current head as well as
>>with 795ae7a reverted on top of it? (It's a clean revert.)
>>
>
>Sure, I have queued test jobs for both and will update the comparison
>once I get the results.
FYI, below is the comparison between 3ed3a4f, 795ae7a, v4.7-rc2 and the
revert commit (eaa7f0d).
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/ivb43/pixz
commit:
3ed3a4f0ddffece942bb2661924d87be4ce63cb7
795ae7a0de6b834a0cc202aa55c190ef81496665
v4.7-rc2
eaa7f0d7239a9508c616f208527cd34eb10ec90f
3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55 v4.7-rc2 eaa7f0d7239a9508c616f20852
---------------- -------------------------- -------------------------- --------------------------
fail:runs %reproduction fail:runs %reproduction fail:runs %reproduction fail:runs
| | | | | | |
:4 50% 2:7 0% :3 33% 1:6 kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
:4 0% :7 0% :3 33% 1:6 kmsg.igb#:#:#:exceed_max#second
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
78505362 ± 0% -9.2% 71298182 ± 0% -11.4% 69554765 ± 0% -9.4% 71129419 ± 0% pixz.throughput
5586220 ± 2% -1.6% 5498492 ± 2% +6.4% 5942217 ± 2% +9.6% 6122002 ± 0% pixz.time.involuntary_context_switches
6560113 ± 0% -0.6% 6518532 ± 0% -0.6% 6519065 ± 0% -1.3% 6475587 ± 0% pixz.time.maximum_resident_set_size
4582198 ± 2% -3.6% 4416275 ± 2% -9.0% 4167645 ± 4% -6.7% 4275069 ± 1% pixz.time.minor_page_faults
4530 ± 0% +1.0% 4575 ± 0% -1.7% 4454 ± 0% -1.5% 4463 ± 0% pixz.time.percent_of_cpu_this_job_got
92.03 ± 0% +5.6% 97.23 ± 11% +30.5% 120.08 ± 1% +30.0% 119.61 ± 0% pixz.time.system_time
14911 ± 0% +2.1% 15218 ± 0% -1.1% 14753 ± 1% -1.1% 14743 ± 0% pixz.time.user_time
6586930 ± 0% -8.4% 6033444 ± 1% -4.4% 6298874 ± 1% -2.4% 6425814 ± 0% pixz.time.voluntary_context_switches
2179703 ± 4% +4.8% 2285049 ± 2% -7.5% 2015162 ± 3% -9.9% 1963644 ± 6% softirqs.RCU
2237 ± 2% -2.9% 2172 ± 7% +21.1% 2709 ± 2% +18.6% 2653 ± 2% uptime.idle
4582198 ± 2% -3.6% 4416275 ± 2% -9.0% 4167645 ± 4% -6.7% 4275069 ± 1% time.minor_page_faults
92.03 ± 0% +5.6% 97.23 ± 11% +30.5% 120.08 ± 1% +30.0% 119.61 ± 0% time.system_time
49869 ± 1% -12.6% 43583 ± 8% -17.9% 40960 ± 0% -16.2% 41807 ± 1% vmstat.system.cs
97890 ± 1% -0.0% 97848 ± 3% +6.9% 104621 ± 2% +7.3% 105000 ± 1% vmstat.system.in
105682 ± 1% +0.6% 106297 ± 1% -85.6% 15257 ± 1% -85.1% 15716 ± 1% meminfo.Active(file)
390126 ± 0% -0.2% 389529 ± 0% +24.0% 483736 ± 0% +23.9% 483288 ± 0% meminfo.Inactive
380750 ± 0% -0.2% 380141 ± 0% +24.6% 474268 ± 0% +24.4% 473841 ± 0% meminfo.Inactive(file)
2401 ±107% +76.9% 4247 ± 79% -99.8% 5.67 ± 22% -99.7% 6.17 ± 34% numa-numastat.node0.other_node
2074670 ± 2% -11.3% 1840052 ± 11% -20.3% 1653661 ± 14% -25.6% 1543123 ± 7% numa-numastat.node1.local_node
2081648 ± 2% -11.4% 1844923 ± 11% -20.6% 1653671 ± 14% -25.9% 1543128 ± 7% numa-numastat.node1.numa_hit
6977 ± 36% -30.2% 4871 ± 66% -99.9% 10.33 ± 19% -99.9% 4.67 ± 26% numa-numastat.node1.other_node
13061458 ± 19% -3.3% 12634644 ± 24% +69.6% 22152177 ± 6% +100.0% 26127040 ± 10% cpuidle.C1-IVT.time
193807 ± 15% +26.8% 245657 ± 76% +102.5% 392490 ± 10% +91.9% 371904 ± 5% cpuidle.C1-IVT.usage
154.75 ± 10% -14.8% 131.86 ± 15% -20.1% 123.67 ± 2% -19.5% 124.50 ± 16% cpuidle.C1E-IVT.usage
8.866e+08 ± 2% -15.6% 7.479e+08 ± 6% +26.7% 1.124e+09 ± 6% +22.7% 1.088e+09 ± 3% cpuidle.C6-IVT.time
93283 ± 0% -13.2% 80988 ± 3% +20.5% 112364 ± 4% +20.3% 112255 ± 4% cpuidle.C6-IVT.usage
8559466 ± 20% -39.3% 5195127 ± 40% -100.0% 157.67 ± 36% -97.5% 211608 ±167% cpuidle.POLL.time
771388 ± 9% -53.4% 359081 ± 52% -100.0% 32.67 ± 15% -100.0% 22.50 ± 34% cpuidle.POLL.usage
94.35 ± 0% +1.0% 95.28 ± 0% -1.8% 92.69 ± 0% -1.6% 92.88 ± 0% turbostat.%Busy
2824 ± 0% +1.0% 2851 ± 0% -1.8% 2774 ± 0% -1.6% 2780 ± 0% turbostat.Avg_MHz
3.57 ± 3% -20.9% 2.83 ± 6% +16.1% 4.15 ± 6% +9.0% 3.89 ± 1% turbostat.CPU%c1
2.07 ± 3% -8.8% 1.89 ± 10% +52.3% 3.16 ± 6% +55.6% 3.23 ± 6% turbostat.CPU%c6
157.67 ± 0% -0.7% 156.51 ± 0% -1.6% 155.20 ± 0% -1.5% 155.38 ± 0% turbostat.CorWatt
0.17 ± 17% -2.9% 0.17 ± 23% +145.7% 0.43 ± 26% +121.0% 0.39 ± 21% turbostat.Pkg%pc2
192.71 ± 0% -0.8% 191.15 ± 0% -1.5% 189.80 ± 0% -1.3% 190.11 ± 0% turbostat.PkgWatt
22.36 ± 0% -8.4% 20.49 ± 0% -10.1% 20.10 ± 0% -8.4% 20.48 ± 0% turbostat.RAMWatt
632.75 ± 3% -1.6% 622.43 ± 3% -19.4% 510.00 ± 0% -19.4% 510.00 ± 0% slabinfo.RAW.active_objs
632.75 ± 3% -1.6% 622.43 ± 3% -19.4% 510.00 ± 0% -19.4% 510.00 ± 0% slabinfo.RAW.num_objs
1512 ± 1% -0.6% 1502 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% slabinfo.UNIX.active_objs
1512 ± 1% -0.6% 1502 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% slabinfo.UNIX.num_objs
766.50 ± 10% +7.5% 823.86 ± 10% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% slabinfo.avc_xperms_node.active_objs
766.50 ± 10% +7.5% 823.86 ± 10% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% slabinfo.avc_xperms_node.num_objs
13334 ± 4% +6.9% 14255 ± 4% +11.1% 14817 ± 5% +6.8% 14243 ± 7% slabinfo.kmalloc-512.num_objs
357.00 ± 2% +0.7% 359.43 ± 1% +34.5% 480.33 ± 0% +32.5% 473.17 ± 1% slabinfo.kmalloc-8192.num_objs
8080 ± 3% +1.9% 8233 ± 4% +15.1% 9300 ± 5% +16.0% 9375 ± 1% slabinfo.kmalloc-96.active_objs
8125 ± 3% +1.9% 8281 ± 4% +15.1% 9349 ± 5% +16.3% 9451 ± 1% slabinfo.kmalloc-96.num_objs
Thanks,
Xiaolong
>
>Thanks,
>Xiaolong
>
>>Thanks