Date: Tue, 7 Jun 2016 12:48:17 +0800
From: Ye Xiaolong <xiaolong.ye@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Rik van Riel <riel@...hat.com>, lkp@...org,
	LKML <linux-kernel@...r.kernel.org>, Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [LKP] [lkp] [mm] 795ae7a0de: pixz.throughput -9.1% regression

On Mon, Jun 06, 2016 at 04:53:07PM +0800, Ye Xiaolong wrote:
>On Fri, Jun 03, 2016 at 06:21:09PM -0400, Johannes Weiner wrote:
>>On Thu, Jun 02, 2016 at 12:07:06PM -0400, Johannes Weiner wrote:
>>> Hi,
>>>
>>> On Thu, Jun 02, 2016 at 02:45:07PM +0800, kernel test robot wrote:
>>> > FYI, we noticed pixz.throughput -9.1% regression due to commit:
>>> >
>>> > commit 795ae7a0de6b834a0cc202aa55c190ef81496665 ("mm: scale kswapd watermarks in proportion to memory")
>>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>> >
>>> > in testcase: pixz
>>> > on test machine: ivb43: 48 threads Ivytown Ivy Bridge-EP with 64G memory
>>> > with following parameters: cpufreq_governor=performance/nr_threads=100%
>>>
>>> Xiaolong, thanks for the report.
>>>
>>> It looks like the regression stems from a change in NUMA placement:
>>
>>Scratch that, I was misreading the test results. It's just fewer
>>allocations in general that happen during the fixed testing time.
>>
>>I'm stumped by this report. All this patch does other than affect page
>>reclaim (which is not involved here) is increase the size of the round
>>robin batches in the fair zone allocator. That should *reduce* work in
>>the page allocator, if anything.
>>
>>But I also keep failing to reproduce this - having tried it on the
>>third machine now - neither pixz nor will-it-scale/page_fault1 gives me
>>that -8-9% regression:
>>
>>4.5.0-02574-g3ed3a4f:
>>PIXZ-good.log:throughput: 39908733.604941994
>>PIXZ-good.log:throughput: 37067947.25049969
>>PIXZ-good.log:throughput: 38604938.39131216
>>
>>4.5.0-02575-g795ae7a:
>>PIXZ-bad.log:throughput: 39489120.87179377
>>PIXZ-bad.log:throughput: 39307299.288432725
>>PIXZ-bad.log:throughput: 38795994.3329781
>>
>>Is this reliably reproducible with 3ed3a4f vs 795ae7a?
>
>Yes, it can be stably reproduced in the LKP environment. I've run
>3ed3a4f0ddffece9 4 times, and 795ae7a0de6b834a0cc202aa55 7 times.
>
>3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55
>---------------- --------------------------
>       fail:runs  %reproduction    fail:runs
>           |             |             |
>          :4           50%           2:7     kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
>         %stddev     %change         %stddev
>             \          |                \
>  78505362 ±  0%      -9.2%   71298182 ±  0%  pixz.throughput
>
>
>>
>>Could I ask you to retry the test with Linus's current head as well as
>>with 795ae7a reverted on top of it? (It's a clean revert.)
>>
>
>Sure, I have queued the test tasks for them, will update the comparison
>once I get the results.

FYI, below is the comparison info between 3ed3a4f, 795ae7a, v4.7-rc2 and
the revert commit (eaa7f0d).
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
  gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/ivb43/pixz

commit:
  3ed3a4f0ddffece942bb2661924d87be4ce63cb7
  795ae7a0de6b834a0cc202aa55c190ef81496665
  v4.7-rc2
  eaa7f0d7239a9508c616f208527cd34eb10ec90f

3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55                   v4.7-rc2 eaa7f0d7239a9508c616f20852
---------------- -------------------------- -------------------------- --------------------------
       fail:runs  %reproduction    fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
           |             |             |             |             |             |             |
          :4           50%           2:7            0%           :3           33%           1:6     kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
          :4            0%           :7             0%           :3           33%           1:6     kmsg.igb#:#:#:exceed_max#second
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
  78505362 ±  0%      -9.2%   71298182 ±  0%     -11.4%   69554765 ±  0%      -9.4%   71129419 ±  0%  pixz.throughput
   5586220 ±  2%      -1.6%    5498492 ±  2%      +6.4%    5942217 ±  2%      +9.6%    6122002 ±  0%  pixz.time.involuntary_context_switches
   6560113 ±  0%      -0.6%    6518532 ±  0%      -0.6%    6519065 ±  0%      -1.3%    6475587 ±  0%  pixz.time.maximum_resident_set_size
   4582198 ±  2%      -3.6%    4416275 ±  2%      -9.0%    4167645 ±  4%      -6.7%    4275069 ±  1%  pixz.time.minor_page_faults
      4530 ±  0%      +1.0%       4575 ±  0%      -1.7%       4454 ±  0%      -1.5%       4463 ±  0%  pixz.time.percent_of_cpu_this_job_got
     92.03 ±  0%      +5.6%      97.23 ± 11%     +30.5%     120.08 ±  1%     +30.0%     119.61 ±  0%  pixz.time.system_time
     14911 ±  0%      +2.1%      15218 ±  0%      -1.1%      14753 ±  1%      -1.1%      14743 ±  0%  pixz.time.user_time
   6586930 ±  0%      -8.4%    6033444 ±  1%      -4.4%    6298874 ±  1%      -2.4%    6425814 ±  0%  pixz.time.voluntary_context_switches
   2179703 ±  4%      +4.8%    2285049 ±  2%      -7.5%    2015162 ±  3%      -9.9%    1963644 ±  6%  softirqs.RCU
      2237 ±  2%      -2.9%       2172 ±  7%     +21.1%       2709 ±  2%     +18.6%       2653 ±  2%  uptime.idle
   4582198 ±  2%      -3.6%    4416275 ±  2%      -9.0%    4167645 ±  4%      -6.7%    4275069 ±  1%  time.minor_page_faults
     92.03 ±  0%      +5.6%      97.23 ± 11%     +30.5%     120.08 ±  1%     +30.0%     119.61 ±  0%  time.system_time
     49869 ±  1%     -12.6%      43583 ±  8%     -17.9%      40960 ±  0%     -16.2%      41807 ±  1%  vmstat.system.cs
     97890 ±  1%      -0.0%      97848 ±  3%      +6.9%     104621 ±  2%      +7.3%     105000 ±  1%  vmstat.system.in
    105682 ±  1%      +0.6%     106297 ±  1%     -85.6%      15257 ±  1%     -85.1%      15716 ±  1%  meminfo.Active(file)
    390126 ±  0%      -0.2%     389529 ±  0%     +24.0%     483736 ±  0%     +23.9%     483288 ±  0%  meminfo.Inactive
    380750 ±  0%      -0.2%     380141 ±  0%     +24.6%     474268 ±  0%     +24.4%     473841 ±  0%  meminfo.Inactive(file)
      2401 ±107%     +76.9%       4247 ± 79%     -99.8%       5.67 ± 22%     -99.7%       6.17 ± 34%  numa-numastat.node0.other_node
   2074670 ±  2%     -11.3%    1840052 ± 11%     -20.3%    1653661 ± 14%     -25.6%    1543123 ±  7%  numa-numastat.node1.local_node
   2081648 ±  2%     -11.4%    1844923 ± 11%     -20.6%    1653671 ± 14%     -25.9%    1543128 ±  7%  numa-numastat.node1.numa_hit
      6977 ± 36%     -30.2%       4871 ± 66%     -99.9%      10.33 ± 19%     -99.9%       4.67 ± 26%  numa-numastat.node1.other_node
  13061458 ± 19%      -3.3%   12634644 ± 24%     +69.6%   22152177 ±  6%    +100.0%   26127040 ± 10%  cpuidle.C1-IVT.time
    193807 ± 15%     +26.8%     245657 ± 76%    +102.5%     392490 ± 10%     +91.9%     371904 ±  5%  cpuidle.C1-IVT.usage
    154.75 ± 10%     -14.8%     131.86 ± 15%     -20.1%     123.67 ±  2%     -19.5%     124.50 ± 16%  cpuidle.C1E-IVT.usage
 8.866e+08 ±  2%     -15.6%  7.479e+08 ±  6%     +26.7%  1.124e+09 ±  6%     +22.7%  1.088e+09 ±  3%  cpuidle.C6-IVT.time
     93283 ±  0%     -13.2%      80988 ±  3%     +20.5%     112364 ±  4%     +20.3%     112255 ±  4%  cpuidle.C6-IVT.usage
   8559466 ± 20%     -39.3%    5195127 ± 40%    -100.0%     157.67 ± 36%     -97.5%     211608 ±167%  cpuidle.POLL.time
    771388 ±  9%     -53.4%     359081 ± 52%    -100.0%      32.67 ± 15%    -100.0%      22.50 ± 34%  cpuidle.POLL.usage
     94.35 ±  0%      +1.0%      95.28 ±  0%      -1.8%      92.69 ±  0%      -1.6%      92.88 ±  0%  turbostat.%Busy
      2824 ±  0%      +1.0%       2851 ±  0%      -1.8%       2774 ±  0%      -1.6%       2780 ±  0%  turbostat.Avg_MHz
      3.57 ±  3%     -20.9%       2.83 ±  6%     +16.1%       4.15 ±  6%      +9.0%       3.89 ±  1%  turbostat.CPU%c1
      2.07 ±  3%      -8.8%       1.89 ± 10%     +52.3%       3.16 ±  6%     +55.6%       3.23 ±  6%  turbostat.CPU%c6
    157.67 ±  0%      -0.7%     156.51 ±  0%      -1.6%     155.20 ±  0%      -1.5%     155.38 ±  0%  turbostat.CorWatt
      0.17 ± 17%      -2.9%       0.17 ± 23%    +145.7%       0.43 ± 26%    +121.0%       0.39 ± 21%  turbostat.Pkg%pc2
    192.71 ±  0%      -0.8%     191.15 ±  0%      -1.5%     189.80 ±  0%      -1.3%     190.11 ±  0%  turbostat.PkgWatt
     22.36 ±  0%      -8.4%      20.49 ±  0%     -10.1%      20.10 ±  0%      -8.4%      20.48 ±  0%  turbostat.RAMWatt
    632.75 ±  3%      -1.6%     622.43 ±  3%     -19.4%     510.00 ±  0%     -19.4%     510.00 ±  0%  slabinfo.RAW.active_objs
    632.75 ±  3%      -1.6%     622.43 ±  3%     -19.4%     510.00 ±  0%     -19.4%     510.00 ±  0%  slabinfo.RAW.num_objs
      1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.active_objs
      1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.num_objs
    766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.active_objs
    766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.num_objs
     13334 ±  4%      +6.9%      14255 ±  4%     +11.1%      14817 ±  5%      +6.8%      14243 ±  7%  slabinfo.kmalloc-512.num_objs
    357.00 ±  2%      +0.7%     359.43 ±  1%     +34.5%     480.33 ±  0%     +32.5%     473.17 ±  1%  slabinfo.kmalloc-8192.num_objs
      8080 ±  3%      +1.9%       8233 ±  4%     +15.1%       9300 ±  5%     +16.0%       9375 ±  1%  slabinfo.kmalloc-96.active_objs
      8125 ±  3%      +1.9%       8281 ±  4%     +15.1%       9349 ±  5%     +16.3%       9451 ±  1%  slabinfo.kmalloc-96.num_objs

Thanks,
Xiaolong

>
>Thanks,
>Xiaolong
>
>>Thanks
>_______________________________________________
>LKP mailing list
>LKP@...ts.01.org
>https://lists.01.org/mailman/listinfo/lkp