Message-ID: <20170223012734.GB31776@yexl-desktop>
Date:   Thu, 23 Feb 2017 09:27:34 +0800
From:   Ye Xiaolong <xiaolong.ye@...el.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Minchan Kim <minchan@...nel.org>,
        Hillf Danton <hillf.zj@...baba-inc.com>,
        Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: Re: [lkp-robot] [mm, vmscan]  5e56dfbd83:  fsmark.files_per_sec
 -11.1% regression

Hi, Michal

On 02/07, Michal Hocko wrote:
[snip]
>Could you retest with a single NUMA node? I am not familiar with the
>benchmark enough to judge it was set up properly for a NUMA machine.

I've retested the commit with a single NUMA node via "numactl -m 0 fs_mark xxx",
and the performance did recover.

Here is the comparison:

commit/compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/md/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  5e56dfbd837421b7fa3c6c06018c6701e2704917/gcc-6/performance/3HDD/4M/btrfs/1/x86_64-rhel-7.2/RAID5/64/debian-x86_64-2016-08-31.cgz/NoSync/ivb44/130G/fsmark
   
(with a single NUMA node)	    (2 NUMA nodes)
-------------------------------------------------------------------- 
       fail:runs   %reproduction    fail:runs
           |              |             |    
         %stddev      %change         %stddev
             \           |                \  
     57.60 ±  0%      -11.1%      51.20 ±  0%  fsmark.files_per_sec
    607.84 ±  0%       +9.0%     662.24 ±  1%  fsmark.time.elapsed_time
    607.84 ±  0%       +9.0%     662.24 ±  1%  fsmark.time.elapsed_time.max
     14317 ±  6%      -12.2%      12568 ±  7%  fsmark.time.involuntary_context_switches
      1864 ±  0%       +0.5%       1873 ±  0%  fsmark.time.maximum_resident_set_size
     12425 ±  0%      +23.3%      15320 ±  3%  fsmark.time.minor_page_faults
     33.00 ±  3%      -33.9%      21.80 ±  1%  fsmark.time.percent_of_cpu_this_job_got
    203.49 ±  3%      -28.1%     146.31 ±  1%  fsmark.time.system_time
    605701 ±  0%       +3.6%     627486 ±  0%  fsmark.time.voluntary_context_switches
    307106 ±  2%      +20.2%     368992 ±  9%  interrupts.CAL:Function_call_interrupts
    183040 ±  0%      +23.2%     225559 ±  3%  softirqs.BLOCK
     12203 ± 57%     +236.4%      41056 ±103%  softirqs.NET_RX
    186118 ±  0%      +21.9%     226922 ±  2%  softirqs.TASKLET
     14317 ±  6%      -12.2%      12568 ±  7%  time.involuntary_context_switches
     12425 ±  0%      +23.3%      15320 ±  3%  time.minor_page_faults
     33.00 ±  3%      -33.9%      21.80 ±  1%  time.percent_of_cpu_this_job_got
    203.49 ±  3%      -28.1%     146.31 ±  1%  time.system_time
      3.47 ±  3%      -13.0%       3.02 ±  1%  turbostat.%Busy
     99.60 ±  1%       -9.6%      90.00 ±  1%  turbostat.Avg_MHz
     78.69 ±  1%       +1.7%      80.01 ±  0%  turbostat.CorWatt
      3.56 ± 61%      -91.7%       0.30 ± 76%  turbostat.Pkg%pc2
    207790 ±  0%       -8.2%     190654 ±  1%  vmstat.io.bo
  30667691 ±  0%      +65.9%   50890669 ±  1%  vmstat.memory.cache
  34549892 ±  0%      -58.4%   14378939 ±  4%  vmstat.memory.free
      6768 ±  0%       -1.3%       6681 ±  1%  vmstat.system.cs
 1.089e+10 ±  2%      +13.4%  1.236e+10 ±  3%  cpuidle.C1E-IVT.time
  11475304 ±  2%      +13.4%   13007849 ±  3%  cpuidle.C1E-IVT.usage
   2.7e+09 ±  6%      +13.2%  3.057e+09 ±  3%  cpuidle.C3-IVT.time
   2954294 ±  6%      +14.3%    3375966 ±  3%  cpuidle.C3-IVT.usage
  96963295 ± 14%      +17.5%  1.139e+08 ± 12%  cpuidle.POLL.time
      8761 ±  7%      +17.6%      10299 ±  9%  cpuidle.POLL.usage
  30454483 ±  0%      +66.4%   50666102 ±  1%  meminfo.Cached
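
For anyone reading the table: the %change column appears to be the relative
difference of the right-hand value against the left-hand one. Below is a
minimal sketch under that assumption. The helper name pct_change is
hypothetical and is not part of lkp-tests; the input values are copied from
the rows above.

```python
def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent (assumed definition)."""
    return (new - base) / base * 100.0


# (left column, right column) pairs copied from the comparison table
rows = {
    "fsmark.files_per_sec": (57.60, 51.20),
    "vmstat.memory.cache": (30667691, 50890669),
    "vmstat.memory.free": (34549892, 14378939),
}

for name, (base, new) in rows.items():
    # One decimal place with an explicit sign, matching the table's style
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

The table shows rounded averages, so recomputing a row from the printed
values can differ from the printed %change in the last digit.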

Do you see what's happening? Or is there anything we can do to make the fsmark
benchmark setup more reasonable?

Thanks,
Xiaolong

>-- 
>Michal Hocko
>SUSE Labs