Date:   Wed, 20 Apr 2022 15:11:27 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Dave Chinner <dchinner@...hat.com>
Cc:     "Darrick J. Wong" <djwong@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com
Subject: [xfs]  0b02c8c0d7:  fio.write_iops -9.0% regression



Greetings,

FYI, we noticed a -9.0% regression of fio.write_iops due to commit:


commit: 0b02c8c0d75a738c98c35f02efb36217c170d78c ("xfs: set prealloc flag in xfs_alloc_file_space()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fio-basic
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with the following parameters:

	runtime: 300s
	disk: 1HDD
	fs: xfs
	nr_task: 100%
	test_size: 128G
	rw: write
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance
	ucode: 0xb000280

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
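
As a rough illustration of the workload shape: with ioengine=falloc and rw=write, fio is documented as issuing fallocate() calls (mode 0) instead of transferring data, so each 4k "write" is a preallocation request to the filesystem. The sketch below imitates that pattern with os.posix_fallocate(). The function name falloc_write and the temporary file are hypothetical and chosen for illustration only; this is not the fio implementation and is not a substitute for the attached job.yaml.

```python
import os
import tempfile


def falloc_write(path, total_bytes, block_size=4096):
    """Preallocate `path` in sequential chunks of `block_size` bytes.

    Hypothetical helper: mimics the request pattern of bs=4k, rw=write
    with a fallocate-based engine. Returns the number of calls issued.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    calls = 0
    try:
        for offset in range(0, total_bytes, block_size):
            length = min(block_size, total_bytes - offset)
            # Allocate blocks for [offset, offset + length) and extend
            # the file size if the range ends past the current EOF.
            os.posix_fallocate(fd, offset, length)
            calls += 1
    finally:
        os.close(fd)
    return calls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "falloc.dat")
        issued = falloc_write(target, 1 << 20)  # 1 MiB in 4 KiB steps
        print(issued, os.path.getsize(target))
```

Because no data is copied, the reported IOPS mostly reflect the cost of the allocation path in the filesystem, which is why a change in xfs_alloc_file_space() can show up directly in fio.write_iops.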

In addition, the commit has a significant impact on the following tests:

+------------------+------------------------------------------------+
| testcase: change | fio-basic: fio.write_iops 19.5% improvement    |
| test machine     | 96 threads 2 sockets Ice Lake with 256G memory |
| test parameters  | bs=4k                                          |
|                  | cpufreq_governor=performance                   |
|                  | disk=1HDD                                      |
|                  | fs=xfs                                         |
|                  | ioengine=falloc                                |
|                  | nr_task=1                                      |
|                  | runtime=300s                                   |
|                  | rw=write                                       |
|                  | test_size=128G                                 |
|                  | ucode=0xb000280                                |
+------------------+------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
  4k/gcc-9/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/300s/write/lkp-icl-2sp1/128G/fio-basic/0xb000280

commit: 
  fbe7e52003 ("xfs: fallocate() should call file_modified()")
  0b02c8c0d7 ("xfs: set prealloc flag in xfs_alloc_file_space()")

fbe7e520036583a7 0b02c8c0d75a738c98c35f02efb 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     80.55 ± 15%     +10.4       90.94        fio.latency_100us%
      0.19 ± 43%      -0.2        0.01 ± 32%  fio.latency_10us%
      1.01 ±117%      -1.0        0.04 ± 63%  fio.latency_20us%
      8.06 ±106%      -8.0        0.08 ± 16%  fio.latency_50us%
     15740 ±  5%     +15.7%      18205        fio.time.involuntary_context_switches
      8976 ±  4%      +4.9%       9412        fio.time.percent_of_cpu_this_job_got
      2542 ±  4%     +15.4%       2932        fio.time.system_time
      4637            -9.0%       4220        fio.write_bw_MBps
    100010 ±  9%     -10.2%      89770 ±  2%  fio.write_clat_90%_us
    144384 ±  2%     +12.5%     162474        fio.write_clat_95%_us
     76605 ±  4%     +15.4%      88388        fio.write_clat_mean_us
   1187140            -9.0%    1080402        fio.write_iops
    358691            +1.7%     364745        proc-vmstat.numa_hit
    271640            +2.2%     277686        proc-vmstat.numa_local
    358538            +1.8%     364830        proc-vmstat.pgalloc_normal
      7.22 ±119%      -5.3        1.88 ±161%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
      7.22 ±119%      -5.2        2.07 ±165%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      8.08 ±114%      -3.8        4.26 ±144%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      8.08 ±114%      -3.8        4.26 ±144%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      8.08 ±114%      -3.8        4.26 ±144%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      8.21 ±114%      -3.8        4.43 ±143%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      6.68 ±130%      -0.4        6.23 ±147%  perf-profile.calltrace.cycles-pp._IO_vfscanf.fscanf
      7.22 ±119%      -5.0        2.24 ±156%  perf-profile.children.cycles-pp.cpuidle_enter
      7.22 ±119%      -5.0        2.24 ±156%  perf-profile.children.cycles-pp.cpuidle_enter_state
      8.08 ±114%      -3.8        4.26 ±144%  perf-profile.children.cycles-pp.start_secondary
      8.21 ±114%      -3.8        4.43 ±143%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      8.21 ±114%      -3.8        4.43 ±143%  perf-profile.children.cycles-pp.cpu_startup_entry
      8.21 ±114%      -3.8        4.43 ±143%  perf-profile.children.cycles-pp.do_idle
      3.94 ± 78%      -2.8        1.14 ±185%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      7.72 ±116%      -1.5        6.23 ±147%  perf-profile.children.cycles-pp._IO_vfscanf
      7.72 ±116%      -2.8        4.94 ±141%  perf-profile.self.cycles-pp._IO_vfscanf
      9.56 ±  6%     -20.3%       7.62 ±  4%  perf-stat.i.MPKI
      0.87 ±  9%      -0.3        0.55 ±  4%  perf-stat.i.branch-miss-rate%
  28854751 ±  3%     -39.1%   17572480 ±  2%  perf-stat.i.branch-misses
     33.09 ±  3%     +15.2       48.25        perf-stat.i.cache-miss-rate%
  50957571 ±  2%      +8.6%   55343067        perf-stat.i.cache-misses
 1.531e+08           -25.6%  1.139e+08 ±  2%  perf-stat.i.cache-references
     16.91           +12.3%      18.99        perf-stat.i.cpi
      5508            -4.7%       5250        perf-stat.i.cycles-between-cache-misses
    218292 ± 55%     -48.4%     112740 ± 31%  perf-stat.i.dTLB-load-misses
 2.158e+09           -25.1%  1.616e+09        perf-stat.i.dTLB-stores
      0.06 ±  2%     -10.9%       0.06 ±  2%  perf-stat.i.ipc
    173.60            -9.1%     157.82        perf-stat.i.major-faults
    286.45 ±  3%     +10.7%     316.97 ±  3%  perf-stat.i.metric.K/sec
    108.46 ±  4%     -10.8%      96.72        perf-stat.i.metric.M/sec
      5465            -6.7%       5101 ±  2%  perf-stat.i.minor-faults
     93.79            +2.0       95.78        perf-stat.i.node-load-miss-rate%
  10120472 ±  3%     +15.0%   11643033        perf-stat.i.node-load-misses
    530906 ±  2%     -33.0%     355501 ±  3%  perf-stat.i.node-loads
     65.02            -3.3       61.74        perf-stat.i.node-store-miss-rate%
   5639450 ±  3%     +15.0%    6488073        perf-stat.i.node-stores
      5639            -6.7%       5259 ±  2%  perf-stat.i.page-faults
      9.30 ±  6%     -19.8%       7.46 ±  2%  perf-stat.overall.MPKI
      0.83 ±  9%      -0.3        0.52 ±  2%  perf-stat.overall.branch-miss-rate%
     33.30 ±  3%     +15.3       48.61        perf-stat.overall.cache-miss-rate%
     17.26           +11.8%      19.30        perf-stat.overall.cpi
      5589            -4.7%       5326        perf-stat.overall.cycles-between-cache-misses
      0.00 ± 61%      -0.0        0.00 ± 31%  perf-stat.overall.dTLB-load-miss-rate%
      0.06           -10.6%       0.05        perf-stat.overall.ipc
     95.01            +2.0       97.04        perf-stat.overall.node-load-miss-rate%
     65.59            -3.4       62.19        perf-stat.overall.node-store-miss-rate%
  27877289 ±  3%     -38.9%   17030231 ±  2%  perf-stat.ps.branch-misses
  49235817 ±  2%      +9.0%   53642585        perf-stat.ps.cache-misses
 1.479e+08           -25.4%  1.104e+08 ±  2%  perf-stat.ps.cache-references
    210467 ± 55%     -48.2%     109086 ± 31%  perf-stat.ps.dTLB-load-misses
 2.085e+09           -24.9%  1.566e+09        perf-stat.ps.dTLB-stores
    166.72            -8.8%     151.97        perf-stat.ps.major-faults
      5267            -6.4%       4932 ±  2%  perf-stat.ps.minor-faults
   9778653 ±  3%     +15.4%   11285466        perf-stat.ps.node-load-misses
    512925 ±  2%     -32.8%     344562 ±  3%  perf-stat.ps.node-loads
   5448424 ±  3%     +15.4%    6288141        perf-stat.ps.node-stores
      5434            -6.4%       5084 ±  2%  perf-stat.ps.page-faults
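
For readers checking the %change column: the numbers are consistent with a relative change against the parent commit for plain counters, and with an absolute difference (shown without a % sign) for metrics that are already percentages. The sketch below reproduces a few rows under that assumption; the helper names pct_change and point_change are made up for this example and are not part of lkp-tests.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, used for metrics that are already rates."""
    return new - base


if __name__ == "__main__":
    # fio.write_iops, nr_task=100%: parent commit vs. 0b02c8c0d7
    print(round(pct_change(1187140, 1080402), 1))
    # fio.write_clat_mean_us, nr_task=100%
    print(round(pct_change(76605, 88388), 1))
    # fio.latency_100us% is a rate, so the table shows a point difference
    print(round(point_change(80.55, 90.94), 1))
```

The values in the table are averages over several runs, so recomputing from the rounded columns can differ from the printed %change in the last digit.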


***************************************************************************************************
lkp-icl-2sp1: 96 threads 2 sockets Ice Lake with 256G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
  4k/gcc-9/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-10.4-x86_64-20200603.cgz/300s/write/lkp-icl-2sp1/128G/fio-basic/0xb000280

commit: 
  fbe7e52003 ("xfs: fallocate() should call file_modified()")
  0b02c8c0d7 ("xfs: set prealloc flag in xfs_alloc_file_space()")

fbe7e520036583a7 0b02c8c0d75a738c98c35f02efb 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.23 ± 16%      -0.1        0.16 ± 19%  fio.latency_10us%
     58.12           -16.1%      48.74 ±  2%  fio.time.elapsed_time
     58.12           -16.1%      48.74 ±  2%  fio.time.elapsed_time.max
     45.53           -20.1%      36.37 ±  2%  fio.time.system_time
      6226           -15.9%       5235 ±  2%  fio.time.voluntary_context_switches
      2268           +19.5%       2709 ±  2%  fio.write_bw_MBps
      1562           -17.9%       1282        fio.write_clat_90%_us
      1594           -17.6%       1314        fio.write_clat_95%_us
      1688           -17.2%       1397        fio.write_clat_99%_us
      1491           -18.7%       1212 ±  2%  fio.write_clat_mean_us
    284.55           -11.4%     252.08 ±  2%  fio.write_clat_stddev
    580741           +19.5%     693756 ±  2%  fio.write_iops
      0.33 ±  6%      +0.1        0.40 ±  5%  mpstat.cpu.all.usr%
     23764 ±105%    +160.9%      62001 ± 31%  numa-numastat.node1.other_node
     23764 ±105%    +160.9%      62001 ± 31%  numa-vmstat.node1.numa_other
 5.635e+09           -16.4%   4.71e+09 ±  2%  cpuidle..time
  11466994           -16.3%    9598650 ±  2%  cpuidle..usage
     20.26 ± 24%     +57.5%      31.91 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.avg
     92.40 ± 15%     +32.5%     122.41 ± 12%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    113.13           -10.2%     101.64 ±  2%  uptime.boot
     10239           -10.7%       9143 ±  2%  uptime.idle
     11391 ±  6%     -20.5%       9059 ±  9%  turbostat.C1
  11372369           -22.9%    8768716 ± 20%  turbostat.C1E
  11580573           -16.3%    9697445 ±  2%  turbostat.IRQ
      1229 ± 11%     -31.3%     845.17 ± 23%  turbostat.POLL
    403186            -4.6%     384712        proc-vmstat.numa_hit
    316491            -5.8%     298019        proc-vmstat.numa_local
    403264            -4.7%     384256        proc-vmstat.pgalloc_normal
    319706            -9.1%     290634        proc-vmstat.pgfault
     17842           -10.6%      15948        proc-vmstat.pgreuse
      7.85 ± 93%      -3.0        4.89 ±186%  perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      6.24 ±100%      -2.1        4.17 ±141%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter
      5.81 ± 93%      -1.2        4.58 ±154%  perf-profile.calltrace.cycles-pp.telldir
      7.85 ± 93%      -3.0        4.89 ±186%  perf-profile.children.cycles-pp.do_fault
      4.24 ±104%      -0.7        3.53 ±144%  perf-profile.children.cycles-pp.number
      4.52 ± 79%      +0.1        4.58 ±154%  perf-profile.children.cycles-pp.telldir
      4.24 ±104%      -0.7        3.53 ±144%  perf-profile.self.cycles-pp.number
   4524366 ±  3%    +298.3%   18020724 ±164%  perf-stat.i.cache-references
    102.84            +2.9%     105.79        perf-stat.i.cpu-migrations
      9.80           +22.4%      12.00 ±  3%  perf-stat.i.major-faults
     49.44 ±  3%    +278.1%     186.93 ±161%  perf-stat.i.metric.K/sec
     32.06 ± 11%      +8.6       40.64 ± 14%  perf-stat.overall.node-store-miss-rate%
     15257           -15.1%      12959        perf-stat.overall.path-length
   4447276 ±  3%    +296.8%   17647918 ±164%  perf-stat.ps.cache-references
    101.10            +2.5%     103.65        perf-stat.ps.cpu-migrations
      9.62           +21.9%      11.73 ±  2%  perf-stat.ps.major-faults
  5.12e+11           -15.1%  4.348e+11        perf-stat.total.instructions
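
A quick cross-check of the nr_task=1 results: with bs=4k, the IOPS and bandwidth rows should describe the same throughput, and 128G at that rate should take roughly the reported elapsed time. The sketch below assumes that fio.write_bw_MBps is in MiB/s (1 MiB = 2^20 bytes), which is what the arithmetic suggests but is not stated in the report; the helper names are hypothetical.

```python
BLOCK_SIZE = 4096          # bs=4k from the test parameters
MIB = 1 << 20
TEST_SIZE = 128 * (1 << 30)  # test_size=128G


def iops_to_mibps(iops, block_size=BLOCK_SIZE):
    """Bandwidth in MiB/s implied by an IOPS figure at a fixed block size."""
    return iops * block_size / MIB


def expected_runtime_s(total_bytes, mibps):
    """Seconds needed to cover `total_bytes` at `mibps` MiB/s."""
    return total_bytes / MIB / mibps


if __name__ == "__main__":
    for label, iops in (("fbe7e52003", 580741), ("0b02c8c0d7", 693756)):
        bw = iops_to_mibps(iops)
        secs = expected_runtime_s(TEST_SIZE, bw)
        print(f"{label}: {bw:.0f} MiB/s, about {secs:.1f} s for 128G")
```

Under these assumptions the parent commit comes out near 2268 MiB/s and a little under 58 seconds, in line with fio.write_bw_MBps and fio.time.elapsed_time above. The job therefore finishes well before the 300s runtime limit, so the drop in elapsed time and the gain in IOPS are two views of the same change.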





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.17.0-rc1-00004-g0b02c8c0d75a" of type "text/plain" (161090 bytes)

View attachment "job-script" of type "text/plain" (8449 bytes)

View attachment "job.yaml" of type "text/plain" (5702 bytes)

View attachment "reproduce" of type "text/plain" (726 bytes)
