Message-ID: <20210317142621.GC28839@xsang-OptiPlex-9020>
Date:   Wed, 17 Mar 2021 22:26:21 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Pavel Begunkov <asml.silence@...il.com>
Cc:     Jens Axboe <axboe@...nel.dk>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [io_uring]  7a612350a9:  fio.read_iops -6.5% regression



Greetings,

FYI, we noticed a -6.5% regression of fio.read_iops due to commit:


commit: 7a612350a989866510dc5c874fd8ffe1f37555d2 ("io_uring: fix complete_post races for linked req")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: ext2
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: read
	bs: 2M
	ioengine: mmap
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  2M/gcc-9/performance/2pmem/ext2/mmap/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5003006

commit: 
  33cc89a9fc ("io_uring: add io_disarm_next() helper")
  7a612350a9 ("io_uring: fix complete_post races for linked req")

33cc89a9fc248a48 7a612350a989866510dc5c874fd 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     96.76           -88.0        8.72 ± 43%  fio.latency_10ms%
      3.22 ± 15%     +88.0       91.27 ±  4%  fio.latency_20ms%
     10129            -6.5%       9473        fio.read_bw_MBps
   9590101            +7.5%   10310997        fio.read_clat_90%_us
   9743018            +7.4%   10463914        fio.read_clat_95%_us
  10223616            +6.8%   10922666        fio.read_clat_99%_us
   9475384            +6.9%   10131741        fio.read_clat_mean_us
      5064            -6.5%       4736        fio.read_iops
 5.187e+08            -6.5%  4.851e+08        fio.time.minor_page_faults
    466.31            -7.2%     432.57        fio.time.user_time
   1013022            -6.5%     947398        fio.workload
   2491451 ± 20%     -27.3%    1812105 ±  7%  cpuidle.C1.time
      2.40            -7.1%       2.23        iostat.cpu.user
      2276 ±  8%     +13.6%       2584 ±  6%  slabinfo.task_group.active_objs
      2276 ±  8%     +13.6%       2584 ±  6%  slabinfo.task_group.num_objs
      4195 ± 15%     +43.6%       6023 ± 31%  softirqs.CPU36.RCU
     17509 ± 37%     -65.8%       5991 ± 39%  softirqs.CPU36.SCHED
     13188 ± 50%     +88.4%      24850 ±  9%  softirqs.CPU84.SCHED
    501094            -6.5%     468624        proc-vmstat.nr_page_table_pages
      2291 ± 66%    +134.5%       5374 ± 31%  proc-vmstat.nr_written
   3230950            -3.9%    3106141        proc-vmstat.numa_hit
   3144164            -4.0%    3019355        proc-vmstat.numa_local
    494753 ±  2%      -5.5%     467629 ±  4%  proc-vmstat.numa_pte_updates
   3402336            -3.6%    3280876        proc-vmstat.pgalloc_normal
   5.2e+08            -6.5%  4.864e+08        proc-vmstat.pgfault
   3042976            -4.2%    2915839        proc-vmstat.pgfree
      0.42 ±  9%      -0.1        0.36 ± 11%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.48 ± 10%      -0.1        0.41 ± 11%  perf-profile.children.cycles-pp._raw_read_lock
      0.20 ± 11%      -0.0        0.16 ± 13%  perf-profile.children.cycles-pp.grab_mapping_entry
      0.10 ± 13%      -0.0        0.08 ± 14%  perf-profile.children.cycles-pp.get_unlocked_entry
      0.12 ± 10%      -0.0        0.09 ± 13%  perf-profile.children.cycles-pp.dax_iomap_pfn
      0.09 ± 12%      -0.0        0.07 ± 17%  perf-profile.children.cycles-pp.xas_find_conflict
      0.12 ± 11%      -0.0        0.10 ± 12%  perf-profile.children.cycles-pp.xas_store
      0.12 ± 10%      -0.0        0.10 ± 10%  perf-profile.children.cycles-pp.dax_unlock_entry
      0.47 ± 10%      -0.1        0.40 ± 12%  perf-profile.self.cycles-pp._raw_read_lock
      0.42 ±  9%      -0.1        0.36 ± 11%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.08 ± 10%      -0.0        0.06 ± 16%  perf-profile.self.cycles-pp.xas_find_conflict
     37.17 ± 82%    +526.0%     232.67 ± 87%  interrupts.CPU33.TLB:TLB_shootdowns
      4426 ± 41%     +69.1%       7485 ±  9%  interrupts.CPU36.NMI:Non-maskable_interrupts
      4426 ± 41%     +69.1%       7485 ±  9%  interrupts.CPU36.PMI:Performance_monitoring_interrupts
     85.17 ± 65%    +118.6%     186.17 ± 11%  interrupts.CPU36.RES:Rescheduling_interrupts
      4893 ± 48%     +56.5%       7656 ±  4%  interrupts.CPU67.NMI:Non-maskable_interrupts
      4893 ± 48%     +56.5%       7656 ±  4%  interrupts.CPU67.PMI:Performance_monitoring_interrupts
      4248 ± 40%     +59.8%       6788 ± 14%  interrupts.CPU70.NMI:Non-maskable_interrupts
      4248 ± 40%     +59.8%       6788 ± 14%  interrupts.CPU70.PMI:Performance_monitoring_interrupts
      2783 ± 24%    +138.2%       6629 ± 27%  interrupts.CPU76.NMI:Non-maskable_interrupts
      2783 ± 24%    +138.2%       6629 ± 27%  interrupts.CPU76.PMI:Performance_monitoring_interrupts
    573.50 ±  4%     +11.0%     636.83 ±  8%  interrupts.CPU81.CAL:Function_call_interrupts
    129.17 ± 43%     -79.1%      27.00 ± 72%  interrupts.CPU84.RES:Rescheduling_interrupts
     38.00 ± 80%    +151.3%      95.50 ± 47%  interrupts.CPU84.TLB:TLB_shootdowns
     39.33 ± 63%    +127.5%      89.50 ± 31%  interrupts.CPU87.TLB:TLB_shootdowns
     41.33 ± 51%     +93.1%      79.83 ± 23%  interrupts.CPU88.TLB:TLB_shootdowns
     45.33 ± 42%    +113.6%      96.83 ± 30%  interrupts.CPU93.TLB:TLB_shootdowns
     15.18            -3.4%      14.67        perf-stat.i.MPKI
 5.037e+09            -2.9%  4.893e+09        perf-stat.i.branch-instructions
      0.33            -0.0        0.33        perf-stat.i.branch-miss-rate%
  16306932            -4.4%   15591709        perf-stat.i.branch-misses
 2.322e+08            -6.1%   2.18e+08        perf-stat.i.cache-misses
 3.253e+08            -6.2%  3.052e+08        perf-stat.i.cache-references
      6.25            +3.1%       6.44        perf-stat.i.cpi
    582.91            +6.4%     620.25        perf-stat.i.cycles-between-cache-misses
      0.97            -0.0        0.93        perf-stat.i.dTLB-load-miss-rate%
  53274570            -6.9%   49597287        perf-stat.i.dTLB-load-misses
 5.431e+09            -3.1%   5.26e+09        perf-stat.i.dTLB-loads
   1.7e+09            -6.0%  1.598e+09        perf-stat.i.dTLB-stores
     83.93            +2.6       86.55        perf-stat.i.iTLB-load-miss-rate%
  13088877 ±  4%     +15.3%   15097164        perf-stat.i.iTLB-load-misses
   2466124            -6.4%    2307680        perf-stat.i.iTLB-loads
 2.138e+10            -3.0%  2.073e+10        perf-stat.i.instructions
      1647 ±  4%     -16.1%       1381        perf-stat.i.instructions-per-iTLB-miss
      0.16            -2.7%       0.16        perf-stat.i.ipc
    131.32            -3.5%     126.68        perf-stat.i.metric.M/sec
   2571493            -6.5%    2405116        perf-stat.i.minor-faults
  43574923 ± 10%     -19.9%   34910133 ± 12%  perf-stat.i.node-load-misses
  27094070            -7.1%   25159107        perf-stat.i.node-stores
   2572882            -6.5%    2406509        perf-stat.i.page-faults
     15.22            -3.3%      14.72        perf-stat.overall.MPKI
      6.26            +3.1%       6.46        perf-stat.overall.cpi
    576.54            +6.5%     614.09        perf-stat.overall.cycles-between-cache-misses
      0.97            -0.0        0.93        perf-stat.overall.dTLB-load-miss-rate%
     84.13            +2.6       86.74        perf-stat.overall.iTLB-load-miss-rate%
      1635 ±  4%     -16.0%       1373        perf-stat.overall.instructions-per-iTLB-miss
      0.16            -3.0%       0.15        perf-stat.overall.ipc
   4258659            +3.6%    4412636        perf-stat.overall.path-length
 5.011e+09            -2.9%  4.868e+09        perf-stat.ps.branch-instructions
  16226538            -4.4%   15516193        perf-stat.ps.branch-misses
  2.31e+08            -6.1%  2.169e+08        perf-stat.ps.cache-misses
 3.237e+08            -6.2%  3.036e+08        perf-stat.ps.cache-references
  53008395            -6.9%   49349734        perf-stat.ps.dTLB-load-misses
 5.404e+09            -3.1%  5.234e+09        perf-stat.ps.dTLB-loads
 1.692e+09            -6.0%   1.59e+09        perf-stat.ps.dTLB-stores
  13029271 ±  4%     +15.3%   15021875        perf-stat.ps.iTLB-load-misses
   2453727            -6.4%    2295986        perf-stat.ps.iTLB-loads
 2.127e+10            -3.0%  2.063e+10        perf-stat.ps.instructions
   2558645            -6.5%    2393171        perf-stat.ps.minor-faults
  43355231 ± 10%     -19.9%   34734698 ± 12%  perf-stat.ps.node-load-misses
  26960382            -7.1%   25037880        perf-stat.ps.node-stores
   2560034            -6.5%    2394561        perf-stat.ps.page-faults
 4.314e+12            -3.1%   4.18e+12        perf-stat.total.instructions
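
As a reading aid for the comparison table above, the sketch below recomputes a few of the reported deltas. It assumes that the %change column is the relative difference between the two commits' mean values, and that rows whose change carries no % sign (metrics that are already percentages, such as miss rates) are plain differences in percentage points. That interpretation is inferred from the numbers themselves, not taken from LKP documentation, and the helper names are hypothetical.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, used for metrics that are already percentages."""
    return new - base


# (base commit 33cc89a9fc, tested commit 7a612350a9), values copied from the table
relative_rows = {
    "fio.read_iops": (5064, 4736),
    "fio.read_bw_MBps": (10129, 9473),
    "fio.read_clat_mean_us": (9475384, 10131741),
}
point_rows = {
    "perf-stat.i.iTLB-load-miss-rate%": (83.93, 86.55),
}

for name, (base, new) in relative_rows.items():
    print(f"{name:36s} {pct_change(base, new):+6.1f}%")
for name, (base, new) in point_rows.items():
    print(f"{name:36s} {point_change(base, new):+6.1f}")
```

Under these assumptions the recomputed values agree with the table to one decimal place for the rows listed.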


                                                                                
                                  fio.read_bw_MBps                              
                                                                                
  10200 +-------------------------------------------------------------------+   
        |  .+.+. .+   +.   .+.+.     .+.+. +  +.+.+.   .+.+.+               |   
  10100 |.+     +       +.+     +.+.+     +         +.+                     |   
  10000 |-+                                                                 |   
        |                                                                   |   
   9900 |-+                                                                 |   
        |                                                                   |   
   9800 |-+                                                                 |   
        |                                                                   |   
   9700 |-+                                                                 |   
   9600 |-+                                                                 |   
        |                                                                   |   
   9500 |-O O O O           O   O   O     O   O O   O   O O O   O O   O     |   
        |             O O O   O   O   O O   O     O   O       O     O   O O |   
   9400 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                   fio.read_iops                                
                                                                                
  5100 +--------------------------------------------------------------------+   
       |  .+.+. .+   +.   .+.+.     .+.+.. +  +.+.+.   .+.+.+               |   
  5050 |.+     +       +.+     +.+.+      +         +.+                     |   
  5000 |-+                                                                  |   
       |                                                                    |   
  4950 |-+                                                                  |   
       |                                                                    |   
  4900 |-+                                                                  |   
       |                                                                    |   
  4850 |-+                                                                  |   
  4800 |-+                                                                  |   
       |                                                                    |   
  4750 |-O O O O           O   O   O      O   O O   O   O O O   O O   O     |   
       |             O O O   O   O   O O    O     O   O       O     O   O O |   
  4700 +--------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_mean_us                          
                                                                                
  1.02e+07 +----------------------------------------------------------------+   
           |            O O O   O   O   O O  O O   O O O       O  O O   O O |   
  1.01e+07 |-O O O O          O   O   O     O    O       O O O   O    O     |   
           |                                                                |   
     1e+07 |-+                                                              |   
   9.9e+06 |-+                                                              |   
           |                                                                |   
   9.8e+06 |-+                                                              |   
           |                                                                |   
   9.7e+06 |-+                                                              |   
   9.6e+06 |-+                                                              |   
           |                                                                |   
   9.5e+06 |.+.   .+.    .+.+.   .+.+.+.   .+       .+.+.                   |   
           |   +.+   ++.+     +.+       +.+  +.+.+.+     +.+.+              |   
   9.4e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_90__us                           
                                                                                
  1.05e+07 +----------------------------------------------------------------+   
  1.04e+07 |-+                                                              |   
           |                                                                |   
  1.03e+07 |-+   O   OO O O O O O O O O O O OO O O O O O O O O O OO O O O O |   
  1.02e+07 |-+                                                              |   
           | O O   O                                                        |   
  1.01e+07 |-+                                                              |   
     1e+07 |-+                                                              |   
   9.9e+06 |-+                                                              |   
           |                                                                |   
   9.8e+06 |-+                                                              |   
   9.7e+06 |-+                                                              |   
           |.+.+.+.+.+  +.+.+.+.+.+.+.+.+.+.+  +.+   +.+.+.+                |   
   9.6e+06 |-+        :+                     :+   + +       +               |   
   9.5e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_95__us                           
                                                                                
  1.06e+07 +----------------------------------------------------------------+   
  1.05e+07 |-+        O     O                          O                  O |   
           |                                                                |   
  1.04e+07 |-O O O O O  O O   O O O O O O O OO O O O O   O O O O OO O O O   |   
  1.03e+07 |-+                                                              |   
           |                                                                |   
  1.02e+07 |-+                                                              |   
  1.01e+07 |-+                                                              |   
     1e+07 |-+                                                              |   
           |                                                                |   
   9.9e+06 |-+     +              +         +                               |   
   9.8e+06 |-+    + +            + +       + :                              |   
           |.+.+.+   +  +.+.+.+.+   +.+.+.+  +.+.+.+.+.+.+.+                |   
   9.7e+06 |-+        :+                                    +               |   
   9.6e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_99__us                           
                                                                                
   1.1e+07 +----------------------------------------------------------------+   
  1.09e+07 |-+       OO O O O O O   O   O O  O     O   O O   O O  O O O   O |   
           |                                                                |   
  1.08e+07 |-O O O O              O   O     O  O O   O     O     O      O   |   
  1.07e+07 |-+                                                              |   
           |                                                                |   
  1.06e+07 |-+                                                              |   
  1.05e+07 |-+                                                              |   
  1.04e+07 |-+                                                              |   
           |                                                                |   
  1.03e+07 |.+.+.+.+.++.+.+.+.+.+.+.+.+.+.+.+  +.+.+   +   +                |   
  1.02e+07 |-+                              :  :    + + + + +               |   
           |                                 ::      +   +   +              |   
  1.01e+07 |-+                               ::                             |   
     1e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.latency_10ms_                              
                                                                                
  100 +---------------------------------------------------------------------+   
   90 |-+     +          +     +   +     +           +   +.                 |   
      |                                                                     |   
   80 |-+                                                                   |   
   70 |-+                                                                   |   
      |                                                                     |   
   60 |-+                                                                   |   
   50 |-+                                                                   |   
   40 |-+                                                                   |   
      |                                                                     |   
   30 |-+                                                                   |   
   20 |-+ O   O                                                             |   
      | O   O              O   O                                O O   O     |   
   10 |-+           O O  O   O   O O O O O O O O O O O O O  O O     O   O O |   
    0 +---------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.latency_20ms_                              
                                                                                
  100 +---------------------------------------------------------------------+   
   90 |-+           O O  O   O   O O O O O O O O O O O O O  O O     O   O O |   
      | O   O              O   O                                O O   O     |   
   80 |-+ O   O                                                             |   
   70 |-+                                                                   |   
      |                                                                     |   
   60 |-+                                                                   |   
   50 |-+                                                                   |   
   40 |-+                                                                   |   
      |                                                                     |   
   30 |-+                                                                   |   
   20 |-+                                                                   |   
      |                                                                     |   
   10 |.+.   .+.        .+.   .+. .+.   .+.         .+. .+..                |   
    0 +---------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                     fio.workload                               
                                                                                
  1.02e+06 +----------------------------------------------------------------+   
           |  .+.+. .+  +.   .+.+.     .+.+. : +.+.+.   .+.+.+              |   
  1.01e+06 |.+     +      +.+     +.+.+     +        +.+                    |   
     1e+06 |-+                                                              |   
           |                                                                |   
    990000 |-+                                                              |   
           |                                                                |   
    980000 |-+                                                              |   
           |                                                                |   
    970000 |-+                                                              |   
    960000 |-+                                                              |   
           |                                                                |   
    950000 |-O O O O          O   O   O     O  O O   O   O O O   OO   O     |   
           |            O O O   O   O   O O  O     O   O       O    O   O O |   
    940000 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                              fio.time.minor_page_faults                        
                                                                                
  5.25e+08 +----------------------------------------------------------------+   
           |          +.                     +.   .+.       .+              |   
   5.2e+08 |.+.+.+. .+  +.+.+.+.+. .+.+.+.+. : +.+   +. .+.+                |   
  5.15e+08 |-+     +              +         +          +                    |   
           |                                                                |   
   5.1e+08 |-+                                                              |   
  5.05e+08 |-+                                                              |   
           |                                                                |   
     5e+08 |-+                                                              |   
  4.95e+08 |-+                                                              |   
           |                                                                |   
   4.9e+08 |-+                                                              |   
  4.85e+08 |-O O O O          O O O O O   O O  O O   O   O O O   OO   O O   |   
           |         OO O O O           O    O     O   O       O    O     O |   
   4.8e+08 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00020-g7a612350a989" of type "text/plain" (172899 bytes)

View attachment "job-script" of type "text/plain" (8471 bytes)

View attachment "job.yaml" of type "text/plain" (5877 bytes)

View attachment "reproduce" of type "text/plain" (935 bytes)
