Message-ID: <20170411014608.GU17682@yexl-desktop>
Date:   Tue, 11 Apr 2017 09:46:09 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Filipe Manana <fdmanana@...e.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: [lkp-robot] [Btrfs]  4bcecb33b4:  fsmark.files_per_sec -65%
 regression


Greetings,

FYI, we noticed a 65% regression in fsmark.files_per_sec due to commit:


commit: 4bcecb33b4bc919e0a0383d97d9d4508c8cf78b8 ("Btrfs: fix reported number of inode blocks")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: fsmark
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 128G memory
with the following parameters:

	iterations: 1x
	nr_threads: 64t
	disk: 8BRD_12G
	md: RAID0
	fs: btrfs
	filesize: 4M
	test_size: 60G
	sync_method: fsyncBeforeClose
	cpufreq_governor: performance
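
As a rough sanity check on the workload size, the parameters above imply how many files the run writes. The sketch below is illustrative only: the helper name planned_files is hypothetical, and it assumes binary units (1G = 1024M) and an even split of files across threads, neither of which is stated in this report.

```python
# Values copied from the parameter list above.
# Assumption: "G" and "M" are binary units (GiB, MiB).
MIB = 1024 ** 2
GIB = 1024 ** 3


def planned_files(test_size_bytes, file_size_bytes, nr_threads):
    """Return (total files, files per thread) for a fixed-size workload.

    Assumes the files are divided evenly across the threads.
    """
    total = test_size_bytes // file_size_bytes
    return total, total // nr_threads


total_files, files_per_thread = planned_files(60 * GIB, 4 * MIB, 64)
print("total files:", total_files)
print("files per thread:", files_per_thread)

# Aggregate capacity of the RAID0 array built from 8 ram disks of 12G each.
print("array capacity (GiB):", 8 * 12)
```

Under those assumptions the run writes 15360 files of 4M each, which fits within the 96G array.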

test-description: fsmark is a file system benchmark that tests synchronous write workloads, for example a mail server workload.
test-url: https://sourceforge.net/projects/fsmark/



Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/01org/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

testcase/path_params/tbox_group/run: fsmark/1x-64t-8BRD_12G-RAID0-btrfs-4M-60G-fsyncBeforeClose-performance/lkp-hsx02

2a2bb87c2c221d7d  4bcecb33b4bc919e0a0383d97d  
----------------  --------------------------  
         %stddev      change         %stddev
             \          |                \  
    474957              -6%     444428        fsmark.app_overhead
      1061             -65%        370 ±  4%  fsmark.files_per_sec
    558421             781%    4919293 ±  4%  fsmark.time.voluntary_context_switches
     14.70             186%      42.10 ±  4%  fsmark.time.elapsed_time
     14.70             186%      42.10 ±  4%  fsmark.time.elapsed_time.max
      1210 ±  5%       153%       3067 ±  5%  fsmark.time.involuntary_context_switches
       564             120%       1243 ±  3%  fsmark.time.system_time
      3839             -23%       2955        fsmark.time.percent_of_cpu_this_job_got
     26843 ±  8%       138%      63940 ±  5%  interrupts.CAL:Function_call_interrupts
     15703             -61%       6157 ±  4%  iostat.md0.w/s
   3758783             -61%    1460865 ±  4%  iostat.md0.wkB/s
   3675077             -61%    1430247 ±  4%  vmstat.io.bo
    109228             264%     397596        vmstat.system.cs
     57567              -4%      55284        vmstat.system.in
       779              12%        870        turbostat.Avg_MHz
     27.29              11%      30.17        turbostat.%Busy
       413               4%        430        turbostat.PkgWatt
     59.62              -4%      57.16        turbostat.RAMWatt
   1856908             852%   17672157 ±  5%  perf-stat.context-switches
      1179 ±  7%       571%       7907 ±  5%  perf-stat.cpu-migrations
  18478436 ±  3%       494%  1.098e+08 ±  5%  perf-stat.iTLB-loads
   1.8e+11             240%  6.126e+11 ±  5%  perf-stat.branch-instructions
 1.771e+12 ±  3%       205%  5.395e+12 ±  5%  perf-stat.cpu-cycles
 1.889e+08             174%   5.17e+08 ±  3%  perf-stat.node-load-misses
 7.372e+11             154%   1.87e+12 ±  4%  perf-stat.instructions
 1.752e+11 ± 12%       137%   4.15e+11 ±  7%  perf-stat.dTLB-loads
 1.195e+08             134%  2.796e+08        perf-stat.node-store-misses
  1.91e+08 ±  3%       121%  4.225e+08 ±  3%  perf-stat.branch-misses
     77270             113%     164378 ±  3%  perf-stat.page-faults
     77270             113%     164378 ±  3%  perf-stat.minor-faults
    395924 ± 38%       111%     836529 ± 16%  perf-stat.instructions-per-iTLB-miss
 5.534e+08              91%  1.057e+09 ±  3%  perf-stat.cache-misses
 2.066e+09              60%  3.299e+09        perf-stat.cache-references
 4.201e+08 ± 13%        47%  6.178e+08 ± 15%  perf-stat.dTLB-load-misses
  42780444 ± 10%        42%   60856921 ±  3%  perf-stat.dTLB-store-misses
 4.081e+10 ±  6%        38%   5.63e+10        perf-stat.dTLB-stores
 1.468e+08              33%  1.951e+08 ±  3%  perf-stat.node-stores
     44.86              31%      58.91        perf-stat.node-store-miss-rate%
     70.58              30%      91.96        perf-stat.node-load-miss-rate%
     26.80              20%      32.05        perf-stat.cache-miss-rate%
      0.42             -17%       0.35        perf-stat.ipc
      0.11             -35%       0.07        perf-stat.branch-miss-rate%
      0.24 ±  6%       -38%       0.15 ± 10%  perf-stat.dTLB-load-miss-rate%
  78732161 ±  4%       -43%   45125639        perf-stat.node-loads
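
The "change" column in the table above can be re-derived from the two value columns, and some perf-stat rows are ratios of other rows. The sketch below is a minimal illustration, not the lkp-tests implementation: the helper names are hypothetical, and it assumes the change is the relative difference against the parent commit (left column), rounded to a whole percent.

```python
def percent_change(base, new):
    """Relative change of new against base, in whole percent."""
    return round((new - base) / base * 100)


def ratio(numerator, denominator, digits=2):
    """Rounded quotient, used here for instructions per cycle."""
    return round(numerator / denominator, digits)


# (metric, value on parent 2a2bb87c2c, value on 4bcecb33b4), copied from the table.
rows = [
    ("fsmark.files_per_sec", 1061, 370),
    ("fsmark.time.voluntary_context_switches", 558421, 4919293),
    ("fsmark.time.elapsed_time", 14.70, 42.10),
    ("fsmark.time.system_time", 564, 1243),
]

for name, base, new in rows:
    print(f"{name}: {percent_change(base, new)}%")

# perf-stat.ipc is instructions divided by cpu-cycles.
ipc_base = ratio(7.372e11, 1.771e12)
ipc_new = ratio(1.87e12, 5.395e12)
print("ipc:", ipc_base, "->", ipc_new)
```

With the values from the table, this gives -65%, 781%, 186% and 120% for the four rows, and an ipc of 0.42 falling to 0.35, matching the reported figures.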




                               fsmark.files_per_sec

  1200 ++-------------------------------------------------------------------+
       |                                      *      *          *  *   *    |
  1100 **** **     *  ****** **** **** **.** * ****** ******** * ** *** * ***
  1000 ++  *  ***** **      *    *    *     *                 *          *  |
       |                                                                    |
   900 ++                                                                   |
   800 ++                                                                   |
       |                                                                    |
   700 ++                                                                   |
   600 ++                                                                   |
       |                                                                    |
   500 ++                                                                   |
   400 ++                                                                   |
       |OOOOOOOOOOOOOOOO O OOOOOOOOOOO                                      |
   300 O+---------------O-O-------------------------------------------------+



  [*] bisect-good sample
  [O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.10.0-rc8-00201-g4bcecb3" of type "text/plain" (102826 bytes)

View attachment "job-script" of type "text/plain" (7224 bytes)

View attachment "job.yaml" of type "text/plain" (4721 bytes)

View attachment "reproduce" of type "text/plain" (1607 bytes)
