Open Source and information security mailing list archives
 
Date:	Tue, 7 Jun 2016 12:48:17 +0800
From:	Ye Xiaolong <xiaolong.ye@...el.com>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Rik van Riel <riel@...hat.com>, lkp@...org,
	LKML <linux-kernel@...r.kernel.org>,
	Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [LKP] [lkp] [mm] 795ae7a0de: pixz.throughput -9.1% regression

On Mon, Jun 06, 2016 at 04:53:07PM +0800, Ye Xiaolong wrote:
>On Fri, Jun 03, 2016 at 06:21:09PM -0400, Johannes Weiner wrote:
>>On Thu, Jun 02, 2016 at 12:07:06PM -0400, Johannes Weiner wrote:
>>> Hi,
>>> 
>>> On Thu, Jun 02, 2016 at 02:45:07PM +0800, kernel test robot wrote:
>>> > FYI, we noticed pixz.throughput -9.1% regression due to commit:
>>> > 
>>> > commit 795ae7a0de6b834a0cc202aa55c190ef81496665 ("mm: scale kswapd watermarks in proportion to memory")
>>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>> > 
>>> > in testcase: pixz
>>> > on test machine: ivb43: 48 threads Ivytown Ivy Bridge-EP with 64G memory with following parameters: cpufreq_governor=performance/nr_threads=100%
>>> 
>>> Xiaolong, thanks for the report.
>>> 
>>> It looks like the regression stems from a change in NUMA placement:
>>
>>Scratch that, I was misreading the test results. It's just fewer
>>allocations in general that happen during the fixed testing time.
>>
>>I'm stumped by this report. All this patch does other than affect page
>>reclaim (which is not involved here) is increase the size of the round
>>robin batches in the fair zone allocator. That should *reduce* work in
>>the page allocator, if anything.
>>
>>But I also keep failing to reproduce this - having tried it on the
>>third machine now - neither pixz nor will-it-scale/page_fault1 give me
>>that -8-9% regression:
>>
>>4.5.0-02574-g3ed3a4f:
>>PIXZ-good.log:throughput: 39908733.604941994
>>PIXZ-good.log:throughput: 37067947.25049969
>>PIXZ-good.log:throughput: 38604938.39131216
>>
>>4.5.0-02575-g795ae7a:
>> PIXZ-bad.log:throughput: 39489120.87179377
>> PIXZ-bad.log:throughput: 39307299.288432725
>> PIXZ-bad.log:throughput: 38795994.3329781
>>
>>Is this reliably reproducible with 3ed3a4f vs 795ae7a?
>
>Yes, it can be stably reproduced in the LKP environment; I've run 3ed3a4f0ddffece9 
>4 times and 795ae7a0de6b834a0cc202aa55 7 times.
>
>3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55
>---------------- --------------------------
>       fail:runs  %reproduction    fail:runs
>           |             |             |
>           :4           50%           2:7     kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
>         %stddev     %change         %stddev
>             \          |                \
>  78505362 ±  0%      -9.2%   71298182 ±  0%  pixz.throughput
>
>
>>
>>Could I ask you to retry the test with Linus's current head as well as
>>with 795ae7a reverted on top of it? (It's a clean revert.)
>>
>
>Sure, I have queued the test tasks for them, will update the comparison
>once I get the results.

FYI, below is the comparison between 3ed3a4f, 795ae7a, v4.7-rc2, and the
revert commit (eaa7f0d).

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
  gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/ivb43/pixz

commit: 
  3ed3a4f0ddffece942bb2661924d87be4ce63cb7
  795ae7a0de6b834a0cc202aa55c190ef81496665
  v4.7-rc2
  eaa7f0d7239a9508c616f208527cd34eb10ec90f

3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55                   v4.7-rc2 eaa7f0d7239a9508c616f20852
---------------- -------------------------- -------------------------- --------------------------
       fail:runs  %reproduction    fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
           |             |             |             |             |             |             |
           :4           50%           2:7            0%            :3           33%           1:6     kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
           :4            0%            :7            0%            :3           33%           1:6     kmsg.igb#:#:#:exceed_max#second
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
  78505362 ±  0%      -9.2%   71298182 ±  0%     -11.4%   69554765 ±  0%      -9.4%   71129419 ±  0%  pixz.throughput
   5586220 ±  2%      -1.6%    5498492 ±  2%      +6.4%    5942217 ±  2%      +9.6%    6122002 ±  0%  pixz.time.involuntary_context_switches
   6560113 ±  0%      -0.6%    6518532 ±  0%      -0.6%    6519065 ±  0%      -1.3%    6475587 ±  0%  pixz.time.maximum_resident_set_size
   4582198 ±  2%      -3.6%    4416275 ±  2%      -9.0%    4167645 ±  4%      -6.7%    4275069 ±  1%  pixz.time.minor_page_faults
      4530 ±  0%      +1.0%       4575 ±  0%      -1.7%       4454 ±  0%      -1.5%       4463 ±  0%  pixz.time.percent_of_cpu_this_job_got
     92.03 ±  0%      +5.6%      97.23 ± 11%     +30.5%     120.08 ±  1%     +30.0%     119.61 ±  0%  pixz.time.system_time
     14911 ±  0%      +2.1%      15218 ±  0%      -1.1%      14753 ±  1%      -1.1%      14743 ±  0%  pixz.time.user_time
   6586930 ±  0%      -8.4%    6033444 ±  1%      -4.4%    6298874 ±  1%      -2.4%    6425814 ±  0%  pixz.time.voluntary_context_switches
   2179703 ±  4%      +4.8%    2285049 ±  2%      -7.5%    2015162 ±  3%      -9.9%    1963644 ±  6%  softirqs.RCU
      2237 ±  2%      -2.9%       2172 ±  7%     +21.1%       2709 ±  2%     +18.6%       2653 ±  2%  uptime.idle
   4582198 ±  2%      -3.6%    4416275 ±  2%      -9.0%    4167645 ±  4%      -6.7%    4275069 ±  1%  time.minor_page_faults
     92.03 ±  0%      +5.6%      97.23 ± 11%     +30.5%     120.08 ±  1%     +30.0%     119.61 ±  0%  time.system_time
     49869 ±  1%     -12.6%      43583 ±  8%     -17.9%      40960 ±  0%     -16.2%      41807 ±  1%  vmstat.system.cs
     97890 ±  1%      -0.0%      97848 ±  3%      +6.9%     104621 ±  2%      +7.3%     105000 ±  1%  vmstat.system.in
    105682 ±  1%      +0.6%     106297 ±  1%     -85.6%      15257 ±  1%     -85.1%      15716 ±  1%  meminfo.Active(file)
    390126 ±  0%      -0.2%     389529 ±  0%     +24.0%     483736 ±  0%     +23.9%     483288 ±  0%  meminfo.Inactive
    380750 ±  0%      -0.2%     380141 ±  0%     +24.6%     474268 ±  0%     +24.4%     473841 ±  0%  meminfo.Inactive(file)
      2401 ±107%     +76.9%       4247 ± 79%     -99.8%       5.67 ± 22%     -99.7%       6.17 ± 34%  numa-numastat.node0.other_node
   2074670 ±  2%     -11.3%    1840052 ± 11%     -20.3%    1653661 ± 14%     -25.6%    1543123 ±  7%  numa-numastat.node1.local_node
   2081648 ±  2%     -11.4%    1844923 ± 11%     -20.6%    1653671 ± 14%     -25.9%    1543128 ±  7%  numa-numastat.node1.numa_hit
      6977 ± 36%     -30.2%       4871 ± 66%     -99.9%      10.33 ± 19%     -99.9%       4.67 ± 26%  numa-numastat.node1.other_node
  13061458 ± 19%      -3.3%   12634644 ± 24%     +69.6%   22152177 ±  6%    +100.0%   26127040 ± 10%  cpuidle.C1-IVT.time
    193807 ± 15%     +26.8%     245657 ± 76%    +102.5%     392490 ± 10%     +91.9%     371904 ±  5%  cpuidle.C1-IVT.usage
    154.75 ± 10%     -14.8%     131.86 ± 15%     -20.1%     123.67 ±  2%     -19.5%     124.50 ± 16%  cpuidle.C1E-IVT.usage
 8.866e+08 ±  2%     -15.6%  7.479e+08 ±  6%     +26.7%  1.124e+09 ±  6%     +22.7%  1.088e+09 ±  3%  cpuidle.C6-IVT.time
     93283 ±  0%     -13.2%      80988 ±  3%     +20.5%     112364 ±  4%     +20.3%     112255 ±  4%  cpuidle.C6-IVT.usage
   8559466 ± 20%     -39.3%    5195127 ± 40%    -100.0%     157.67 ± 36%     -97.5%     211608 ±167%  cpuidle.POLL.time
    771388 ±  9%     -53.4%     359081 ± 52%    -100.0%      32.67 ± 15%    -100.0%      22.50 ± 34%  cpuidle.POLL.usage
     94.35 ±  0%      +1.0%      95.28 ±  0%      -1.8%      92.69 ±  0%      -1.6%      92.88 ±  0%  turbostat.%Busy
      2824 ±  0%      +1.0%       2851 ±  0%      -1.8%       2774 ±  0%      -1.6%       2780 ±  0%  turbostat.Avg_MHz
      3.57 ±  3%     -20.9%       2.83 ±  6%     +16.1%       4.15 ±  6%      +9.0%       3.89 ±  1%  turbostat.CPU%c1
      2.07 ±  3%      -8.8%       1.89 ± 10%     +52.3%       3.16 ±  6%     +55.6%       3.23 ±  6%  turbostat.CPU%c6
    157.67 ±  0%      -0.7%     156.51 ±  0%      -1.6%     155.20 ±  0%      -1.5%     155.38 ±  0%  turbostat.CorWatt
      0.17 ± 17%      -2.9%       0.17 ± 23%    +145.7%       0.43 ± 26%    +121.0%       0.39 ± 21%  turbostat.Pkg%pc2
    192.71 ±  0%      -0.8%     191.15 ±  0%      -1.5%     189.80 ±  0%      -1.3%     190.11 ±  0%  turbostat.PkgWatt
     22.36 ±  0%      -8.4%      20.49 ±  0%     -10.1%      20.10 ±  0%      -8.4%      20.48 ±  0%  turbostat.RAMWatt
    632.75 ±  3%      -1.6%     622.43 ±  3%     -19.4%     510.00 ±  0%     -19.4%     510.00 ±  0%  slabinfo.RAW.active_objs
    632.75 ±  3%      -1.6%     622.43 ±  3%     -19.4%     510.00 ±  0%     -19.4%     510.00 ±  0%  slabinfo.RAW.num_objs
      1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.active_objs
      1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.num_objs
    766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.active_objs
    766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.num_objs
     13334 ±  4%      +6.9%      14255 ±  4%     +11.1%      14817 ±  5%      +6.8%      14243 ±  7%  slabinfo.kmalloc-512.num_objs
    357.00 ±  2%      +0.7%     359.43 ±  1%     +34.5%     480.33 ±  0%     +32.5%     473.17 ±  1%  slabinfo.kmalloc-8192.num_objs
      8080 ±  3%      +1.9%       8233 ±  4%     +15.1%       9300 ±  5%     +16.0%       9375 ±  1%  slabinfo.kmalloc-96.active_objs
      8125 ±  3%      +1.9%       8281 ±  4%     +15.1%       9349 ±  5%     +16.3%       9451 ±  1%  slabinfo.kmalloc-96.num_objs
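
For reference, the headline %change columns above are straightforward to recompute from the mean pixz.throughput values; a minimal sketch (an illustration only, with the means copied from the table, using 3ed3a4f as the base):

```python
# Recompute the pixz.throughput %change figures from the reported means.
# Base is 3ed3a4f0ddffece9, the parent of the suspect commit.
base = 78505362

kernels = {
    "795ae7a": 71298182,            # suspect commit
    "v4.7-rc2": 69554765,           # Linus's head at the time
    "eaa7f0d (revert)": 71129419,   # v4.7-rc2 with 795ae7a reverted
}

for name, throughput in kernels.items():
    pct = (throughput - base) / base * 100
    print(f"{name}: {pct:+.1f}%")
# Prints -9.2%, -11.4%, and -9.4%, matching the table above.
```

Note that the revert on top of v4.7-rc2 recovers only part of the gap back to 3ed3a4f, consistent with the other differences between 3ed3a4f and v4.7-rc2 visible in the table.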

Thanks,
Xiaolong

>
>Thanks,
>Xiaolong
>
>>Thanks
>_______________________________________________
>LKP mailing list
>LKP@...ts.01.org
>https://lists.01.org/mailman/listinfo/lkp
