Message-ID: <874m6ryz0u.fsf@yhuang-mobile.sh.intel.com>
Date:	Thu, 11 Aug 2016 10:51:45 -0700
From:	"Huang\, Ying" <ying.huang@...el.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Christoph Hellwig <hch@....de>,
	"Huang\, Ying" <ying.huang@...el.com>,
	Dave Chinner <david@...morbit.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Bob Peterson <rpeterso@...hat.com>,
	"Wu Fengguang" <fengguang.wu@...el.com>, LKP <lkp@...org>
Subject: Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig <hch@....de> wrote:
>>
>> The one liner below (not tested yet) to simply remove it should fix that
>> up.  I also noticed we have a spurious pagefault_disable/enable, I
>> need to dig into the history of that first, though.
>
> Hopefully the pagefault_disable/enable doesn't matter for this case.
>
> Can we get this one-liner tested with the kernel robot for comparison?
> I really think a messed-up LRU list could cause bad IO patterns, and
> end up keeping dirty pages around that should be streaming out to disk
> and re-used, so causing memory pressure etc for no good reason.
>
> I think the mapping->tree_lock issue that Dave sees is interesting
> too, but the kswapd activity (and the extra locking it causes) could
> also be a symptom of the same thing - memory pressure due to just
> putting pages in the active file that simply shouldn't be there.
>
> So the trivial oneliner _might_ just explain things. It would be
> really nice if the regression turns out to be due to something so
> easily fixed.
>

Here are the test results for the debug patch.  The aim7 score appears
slightly better, but the regression is not fully recovered.

commit 5c70fdfdf82723e47ac180e36a7638ca06ea19d8
Author: Christoph Hellwig <hch@....de>
Date:   Thu Aug 11 17:57:21 2016 +0200

    dbg fix 68a9f5e700: aim7.jobs-per-min -13.6% regression
    
    On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
    > The biggest difference is that we have "mark_page_accessed()" show up
    > after, and not before. There was also a lot of LRU noise in the
    > non-profile data. I wonder if that is the reason here: the old model
    > of using generic_perform_write/block_page_mkwrite didn't mark the
    > pages accessed, and now with iomap_file_buffered_write() they get
    > marked as active and that screws up the LRU list, and makes us not
    > flush out the dirty pages well (because they are seen as active and
    > not good for writeback), and then you get bad memory use.
    
    And that's actually a "bug" in the new code - mostly because I failed
    to pick up changes to the core code happening after we 'forked' it,
    in this case commit 2457ae ("mm: non-atomically mark page accessed during page
    cache allocation where possible").
    
    The one liner below (not tested yet) to simply remove it should fix that
    up.  I also noticed we have a spurious pagefault_disable/enable, I
    need to dig into the history of that first, though.

diff --git a/fs/iomap.c b/fs/iomap.c
index 48141b8..f39c318 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -199,7 +199,6 @@ again:
 		pagefault_enable();
 
 		flush_dcache_page(page);
-		mark_page_accessed(page);
 
 		status = iomap_write_end(inode, pos, bytes, copied, page);
 		if (unlikely(status < 0))
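
For reference, below is a minimal userspace sketch of the two-reference
LRU promotion rule that makes the extra call harmful.  It is not kernel
code; the struct and helper names are made up for illustration.  The
point is that, since commit 2457ae, the page cache allocation path
already returns the page with the referenced bit set, so the second
reference from the iomap write path promotes a streaming write page
straight to the active list.

#include <stdbool.h>
#include <stdio.h>

/* Toy model of a page's LRU state; field names are illustrative only. */
struct toy_page {
	bool referenced;	/* models PG_referenced */
	bool active;		/* models "on the active LRU list" */
};

/* Rough model of mark_page_accessed(): a second reference activates. */
static void toy_mark_accessed(struct toy_page *page)
{
	if (!page->active && page->referenced) {
		page->active = true;		/* promote to active list */
		page->referenced = false;
	} else {
		page->referenced = true;
	}
}

int main(void)
{
	/* Post-2457ae: allocation already sets the referenced bit. */
	struct toy_page page = { .referenced = true, .active = false };

	/* The extra call removed by the patch above. */
	toy_mark_accessed(&page);

	printf("active after one buffered write: %s\n",
	       page.active ? "yes" : "no");
	/*
	 * Prints "yes": the page lands on the active list even though it
	 * is a streaming write page, so reclaim keeps it around instead
	 * of letting it be written back and reused.
	 */
	return 0;
}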


=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
  gcc-6/performance/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7

commit: 
  f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
  68a9f5e7007c1afa2cf6830b690a90d0187c0684
  5c70fdfdf82723e47ac180e36a7638ca06ea19d8

f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 5c70fdfdf82723e47ac180e36a 
---------------- -------------------------- -------------------------- 
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
    486586 ±  0%     -13.6%     420523 ±  0%     -11.2%     432165 ±  0%  aim7.jobs-per-min
     37.23 ±  0%     +15.5%      43.02 ±  0%     +12.4%      41.87 ±  0%  aim7.time.elapsed_time
     37.23 ±  0%     +15.5%      43.02 ±  0%     +12.4%      41.87 ±  0%  aim7.time.elapsed_time.max
      6424 ±  1%     +31.3%       8437 ±  1%     +24.4%       7992 ±  3%  aim7.time.involuntary_context_switches
      2489 ±  3%      -2.5%       2425 ±  1%      -1.0%       2465 ±  3%  aim7.time.maximum_resident_set_size
    151288 ±  0%      +3.0%     155764 ±  0%      +2.0%     154252 ±  1%  aim7.time.minor_page_faults
    376.31 ±  0%     +28.4%     483.23 ±  0%     +22.5%     460.96 ±  1%  aim7.time.system_time
    429058 ±  0%     -20.1%     343013 ±  0%     -16.7%     357231 ±  0%  aim7.time.voluntary_context_switches
     16014 ±  0%     +28.3%      20545 ±  1%      -4.0%      15369 ±  0%  meminfo.Active(file)
    127154 ±  9%     -15.5%     107424 ± 10%      -4.4%     121505 ± 11%  softirqs.SCHED
     24561 ±  0%     -27.1%      17895 ±  1%     -22.5%      19033 ±  1%  vmstat.system.cs
     47289 ±  0%      +1.4%      47938 ±  0%      +1.1%      47807 ±  0%  vmstat.system.in
      4003 ±  0%     +27.8%       5117 ±  1%      -4.0%       3842 ±  0%  proc-vmstat.nr_active_file
    979.25 ±  0%     +59.1%       1558 ±  5%     -11.8%     864.00 ±  1%  proc-vmstat.pgactivate
      4699 ±  3%    +132.6%      10932 ± 78%     +77.1%       8321 ± 36%  proc-vmstat.pgpgout
      7868 ±  1%     +28.9%      10140 ±  6%      -1.8%       7724 ±  1%  numa-meminfo.node0.Active(file)
    161402 ±  1%     -21.0%     127504 ± 25%      -1.7%     158708 ±  5%  numa-meminfo.node0.Slab
      8148 ±  1%     +27.8%      10416 ±  7%      -6.1%       7648 ±  1%  numa-meminfo.node1.Active(file)
     81041 ±  3%     +43.8%     116540 ± 27%      +5.7%      85631 ±  8%  numa-meminfo.node1.Slab
     13903 ± 18%     +22.8%      17076 ±  9%     -27.8%      10039 ± 12%  numa-numastat.node0.numa_foreign
     12525 ± 20%     +27.1%      15922 ±  9%     -26.8%       9170 ± 17%  numa-numastat.node0.numa_miss
     14084 ± 18%     -35.4%       9102 ± 17%     +16.5%      16401 ±  8%  numa-numastat.node1.numa_foreign
     15461 ± 17%     -33.7%      10257 ± 14%     +11.7%      17270 ±  9%  numa-numastat.node1.numa_miss
      1966 ±  1%     +30.4%       2565 ±  4%      -1.8%       1930 ±  1%  numa-vmstat.node0.nr_active_file
      4204 ±  3%     +17.7%       4947 ±  7%      +6.1%       4461 ±  6%  numa-vmstat.node0.nr_alloc_batch
    521.75 ±  4%      +5.4%     549.80 ±  4%     -12.0%     459.00 ± 12%  numa-vmstat.node0.nr_dirty
      2037 ±  1%     +24.9%       2543 ±  5%      -6.1%       1912 ±  1%  numa-vmstat.node1.nr_active_file
     24.26 ±  0%      +8.7%      26.36 ±  0%      +7.1%      25.99 ±  0%  turbostat.%Busy
    686.75 ±  0%      +9.2%     749.80 ±  0%      +6.5%     731.67 ±  0%  turbostat.Avg_MHz
      0.29 ±  1%     -24.9%       0.22 ±  2%     -18.8%       0.23 ±  2%  turbostat.CPU%c3
     91.39 ±  2%      +3.5%      94.60 ±  0%      +4.4%      95.45 ±  1%  turbostat.CorWatt
    121.88 ±  1%      +2.6%     124.99 ±  0%      +3.9%     126.57 ±  1%  turbostat.PkgWatt
     37.23 ±  0%     +15.5%      43.02 ±  0%     +12.4%      41.87 ±  0%  time.elapsed_time
     37.23 ±  0%     +15.5%      43.02 ±  0%     +12.4%      41.87 ±  0%  time.elapsed_time.max
      6424 ±  1%     +31.3%       8437 ±  1%     +24.4%       7992 ±  3%  time.involuntary_context_switches
      1038 ±  0%     +10.5%       1148 ±  0%      +8.4%       1126 ±  0%  time.percent_of_cpu_this_job_got
    376.31 ±  0%     +28.4%     483.23 ±  0%     +22.5%     460.96 ±  1%  time.system_time
    429058 ±  0%     -20.1%     343013 ±  0%     -16.7%     357231 ±  0%  time.voluntary_context_switches
  53643508 ±  0%     -19.5%   43181295 ±  2%     -16.4%   44822518 ±  0%  cpuidle.C1-IVT.time
    318952 ±  0%     -25.7%     236993 ±  0%     -21.3%     251084 ±  0%  cpuidle.C1-IVT.usage
   3471235 ±  2%     -18.1%    2843694 ±  3%     -12.9%    3022770 ±  1%  cpuidle.C1E-IVT.time
     46642 ±  1%     -22.5%      36144 ±  0%     -17.4%      38545 ±  0%  cpuidle.C1E-IVT.usage
  12601665 ±  1%     -21.9%    9837996 ±  1%     -18.9%   10218444 ±  0%  cpuidle.C3-IVT.time
     79872 ±  1%     -19.7%      64163 ±  0%     -16.8%      66477 ±  0%  cpuidle.C3-IVT.usage
 1.292e+09 ±  0%     +13.7%  1.469e+09 ±  0%     +10.9%  1.434e+09 ±  0%  cpuidle.C6-IVT.time
      5131 ±121%     -98.5%      75.60 ±200%     -93.3%     344.33 ± 98%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      2966 ± 80%     -27.5%       2151 ± 16%    +359.2%      13624 ± 77%  latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
      5131 ±121%     -98.5%      75.60 ±200%     -93.3%     344.33 ± 98%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      9739 ± 99%     -99.0%      95.40 ±  9%     -99.1%      91.33 ±  5%  latency_stats.max.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
      7739 ± 81%     -71.5%       2202 ± 46%     -77.4%       1752 ± 23%  latency_stats.max.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
      5131 ±121%     -98.5%      75.60 ±200%     -93.3%     344.33 ± 98%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
     10459 ± 97%     -97.5%     263.80 ±  5%     -97.6%     254.67 ±  1%  latency_stats.sum.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
      9097 ± 81%     -70.2%       2708 ± 40%     -74.3%       2335 ± 21%  latency_stats.sum.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
    229.50 ± 11%      -6.7%     214.20 ± 46%     -77.8%      51.00 ±  0%  slabinfo.dio.active_objs
    229.50 ± 11%      -6.7%     214.20 ± 46%     -77.8%      51.00 ±  0%  slabinfo.dio.num_objs
    369.75 ± 17%      -6.2%     346.80 ± 11%     -21.8%     289.00 ± 16%  slabinfo.kmem_cache.active_objs
    369.75 ± 17%      -6.2%     346.80 ± 11%     -21.8%     289.00 ± 16%  slabinfo.kmem_cache.num_objs
      1022 ±  5%      -2.9%     992.80 ± 11%     -11.9%     900.33 ±  3%  slabinfo.nsproxy.active_objs
      1022 ±  5%      -2.9%     992.80 ± 11%     -11.9%     900.33 ±  3%  slabinfo.nsproxy.num_objs
      1836 ± 10%     +60.6%       2949 ±  9%     +45.1%       2665 ±  6%  slabinfo.scsi_data_buffer.active_objs
      1836 ± 10%     +60.6%       2949 ±  9%     +45.1%       2665 ±  6%  slabinfo.scsi_data_buffer.num_objs
    431.75 ± 10%     +60.7%     693.80 ±  9%     +45.1%     626.67 ±  6%  slabinfo.xfs_efd_item.active_objs
    431.75 ± 10%     +60.7%     693.80 ±  9%     +45.1%     626.67 ±  6%  slabinfo.xfs_efd_item.num_objs
  2.59e+11 ±  6%     +22.4%  3.169e+11 ±  5%     +22.9%  3.182e+11 ±  7%  perf-stat.branch-instructions
      0.41 ±  2%      -9.2%       0.38 ±  1%     -25.5%       0.31 ±  3%  perf-stat.branch-miss-rate
 1.072e+09 ±  4%     +11.3%  1.193e+09 ±  3%      -8.5%  9.812e+08 ±  4%  perf-stat.branch-misses
    972882 ±  0%     -17.4%     803235 ±  0%     -14.3%     833753 ±  0%  perf-stat.context-switches
 1.472e+12 ±  6%     +20.5%  1.774e+12 ±  5%     +16.7%  1.717e+12 ±  7%  perf-stat.cpu-cycles
    100350 ±  1%      -5.2%      95091 ±  1%      -3.8%      96553 ±  0%  perf-stat.cpu-migrations
 7.315e+08 ± 24%     +54.9%  1.133e+09 ± 35%    +107.1%  1.515e+09 ± 64%  perf-stat.dTLB-load-misses
 3.225e+11 ±  5%     +38.1%  4.454e+11 ±  3%     +32.2%  4.263e+11 ±  6%  perf-stat.dTLB-loads
 2.176e+11 ±  9%     +45.2%   3.16e+11 ±  5%     +50.1%  3.267e+11 ±  7%  perf-stat.dTLB-stores
 1.452e+12 ±  6%     +27.7%  1.853e+12 ±  5%     +28.0%  1.858e+12 ±  7%  perf-stat.instructions
     42168 ± 16%     +24.9%      52676 ±  7%     +33.4%      56234 ±  6%  perf-stat.instructions-per-iTLB-miss
      0.99 ±  0%      +5.9%       1.05 ±  0%      +9.6%       1.08 ±  0%  perf-stat.ipc
    252401 ±  0%      +6.7%     269219 ±  0%      +5.2%     265524 ±  0%  perf-stat.minor-faults
     10.16 ±  3%     +14.5%      11.63 ±  4%     +12.3%      11.41 ±  3%  perf-stat.node-store-miss-rate
  24842185 ±  2%     +11.0%   27573897 ±  2%      +7.6%   26720271 ±  3%  perf-stat.node-store-misses
 2.198e+08 ±  2%      -4.5%    2.1e+08 ±  5%      -5.6%  2.076e+08 ±  2%  perf-stat.node-stores
    252321 ±  0%      +6.6%     269092 ±  0%      +5.2%     265415 ±  0%  perf-stat.page-faults
      4.08 ± 64%     +63.1%       6.65 ± 77%    +609.5%      28.93 ± 92%  sched_debug.cfs_rq:/.MIN_vruntime.avg
    157.17 ± 61%     +30.0%     204.25 ± 47%    +686.9%       1236 ±112%  sched_debug.cfs_rq:/.MIN_vruntime.max
      4.08 ± 64%     +63.1%       6.65 ± 77%    +609.5%      28.93 ± 92%  sched_debug.cfs_rq:/.max_vruntime.avg
    157.17 ± 61%     +30.0%     204.25 ± 47%    +686.9%       1236 ±112%  sched_debug.cfs_rq:/.max_vruntime.max
    191.00 ± 35%     -23.4%     146.40 ± 15%     -28.4%     136.67 ±  5%  sched_debug.cfs_rq:/.runnable_load_avg.max
     44.43 ± 16%     -20.2%      35.47 ±  9%     -25.2%      33.25 ±  1%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     50.23 ± 19%     -27.1%      36.61 ± 15%     -30.3%      35.03 ± 15%  sched_debug.cpu.cpu_load[1].avg
    466.50 ± 29%     -55.6%     207.20 ± 73%     -47.5%     245.00 ± 67%  sched_debug.cpu.cpu_load[1].max
     77.78 ± 33%     -52.4%      37.05 ± 53%     -48.8%      39.86 ± 51%  sched_debug.cpu.cpu_load[1].stddev
     38.82 ± 19%     -24.2%      29.42 ± 15%     -29.6%      27.31 ± 10%  sched_debug.cpu.cpu_load[2].avg
    300.50 ± 33%     -55.1%     135.00 ± 46%     -50.2%     149.67 ± 46%  sched_debug.cpu.cpu_load[2].max
     51.71 ± 41%     -49.3%      26.20 ± 46%     -50.9%      25.39 ± 33%  sched_debug.cpu.cpu_load[2].stddev
     27.08 ± 19%     -22.4%      21.02 ± 15%     -27.8%      19.56 ± 11%  sched_debug.cpu.cpu_load[3].avg
    185.00 ± 43%     -49.6%      93.20 ± 44%     -48.3%      95.67 ± 30%  sched_debug.cpu.cpu_load[3].max
     32.91 ± 50%     -46.1%      17.75 ± 45%     -49.8%      16.51 ± 25%  sched_debug.cpu.cpu_load[3].stddev
     17.78 ± 20%     -20.9%      14.06 ± 13%     -25.3%      13.28 ± 13%  sched_debug.cpu.cpu_load[4].avg
     20.29 ± 55%     -44.5%      11.26 ± 39%     -47.7%      10.61 ± 27%  sched_debug.cpu.cpu_load[4].stddev
      4929 ± 18%     -24.8%       3704 ± 23%      -4.5%       4708 ± 21%  sched_debug.cpu.nr_load_updates.avg
    276.50 ± 10%      -4.4%     264.20 ±  7%     -14.3%     237.00 ± 19%  sched_debug.cpu.nr_switches.min

Best Regards,
Huang, Ying
