lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210406045929.GB7352@xsang-OptiPlex-9020>
Date:   Tue, 6 Apr 2021 12:59:29 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Eric Whitney <enwlinux@...il.com>
Cc:     Theodore Ts'o <tytso@....edu>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [ext4]  efc6134527:  aim7.jobs-per-min -6.8% regression



Greeting,

FYI, we noticed a -6.8% regression of aim7.jobs-per-min due to commit:


commit: efc61345274d6c7a46a0570efbc916fcbe3e927b ("ext4: shrink race window in ext4_should_retry_alloc()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: aim7
on test machine: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
with following parameters:

	disk: 1BRD_48G
	fs: ext4
	test: creat-clo
	load: 1000
	cpufreq_governor: performance
	ucode: 0x700001e

test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/1BRD_48G/ext4/x86_64-rhel-8.3/1000/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/creat-clo/aim7/0x700001e

commit: 
  v5.12-rc2
  efc6134527 ("ext4: shrink race window in ext4_should_retry_alloc()")

       v5.12-rc2 efc61345274d6c7a46a0570efbc 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      9766            -6.8%       9103        aim7.jobs-per-min
    614.48            +7.3%     659.28        aim7.time.elapsed_time
    614.48            +7.3%     659.28        aim7.time.elapsed_time.max
   1413448            +6.7%    1507983        aim7.time.involuntary_context_switches
     80980            +7.9%      87401        aim7.time.system_time
      7.78            -6.8%       7.25        iostat.cpu.idle
      1336            -6.4%       1251        vmstat.io.bo
     10609            -4.7%      10108        vmstat.system.cs
      1.11 ± 95%     -90.8%       0.10 ±206%  perf-sched.sch_delay.avg.ms.preempt_schedule_common.__cond_resched.dput.terminate_walk.path_openat
      0.03 ± 37%    +365.6%       0.14 ± 71%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.01 ± 72%  +16200.0%       1.25 ±211%  perf-sched.sch_delay.max.ms.preempt_schedule_common.__cond_resched.ext4_journal_check_start.__ext4_journal_start_sb.ext4_ext_remove_space
      0.56 ± 19%     -39.2%       0.34 ± 17%  sched_debug.cfs_rq:/.load_avg.min
  36889619            +8.1%   39874296 ±  4%  sched_debug.cfs_rq:/.min_vruntime.avg
  37479588            +8.1%   40514969 ±  4%  sched_debug.cfs_rq:/.min_vruntime.max
  34455004            +8.1%   37252165 ±  4%  sched_debug.cfs_rq:/.min_vruntime.min
    165.11 ±  9%     -22.2%     128.44 ±  6%  sched_debug.cfs_rq:/.runnable_avg.min
   2979282 ±  4%      +7.7%    3207529 ±  5%  sched_debug.cfs_rq:/.spread0.max
    153.41 ± 10%     -28.7%     109.41 ±  8%  sched_debug.cfs_rq:/.util_avg.min
    241.33            +3.9%     250.67        proc-vmstat.nr_active_file
    241.33            +3.9%     250.67        proc-vmstat.nr_zone_active_file
   2448715            +4.5%    2558105        proc-vmstat.numa_hit
   2253807            +4.9%    2363202        proc-vmstat.numa_local
   2663735            +4.0%    2768966        proc-vmstat.pgalloc_normal
   2930123            +5.8%    3099293        proc-vmstat.pgfault
   2238600            +4.7%    2344785        proc-vmstat.pgfree
    129935            +8.5%     141043        proc-vmstat.pgreuse
    499474            +5.3%     525955        interrupts.CAL:Function_call_interrupts
      1919 ±  3%     +11.1%       2132 ±  3%  interrupts.CPU0.RES:Rescheduling_interrupts
      1968 ±  2%     +11.4%       2192 ±  2%  interrupts.CPU11.RES:Rescheduling_interrupts
      1791 ±  4%      +6.0%       1899 ±  2%  interrupts.CPU122.RES:Rescheduling_interrupts
      3244 ±  8%     +11.3%       3611 ±  3%  interrupts.CPU130.CAL:Function_call_interrupts
      3224 ±  7%     +11.7%       3603 ±  3%  interrupts.CPU136.CAL:Function_call_interrupts
      1974 ±  2%     +10.0%       2172 ±  3%  interrupts.CPU14.RES:Rescheduling_interrupts
      3218 ±  7%     +12.0%       3603 ±  4%  interrupts.CPU141.CAL:Function_call_interrupts
      1993 ±  3%     +10.4%       2201 ±  4%  interrupts.CPU17.RES:Rescheduling_interrupts
      1924 ±  2%     +12.6%       2167 ±  3%  interrupts.CPU2.RES:Rescheduling_interrupts
      1944 ±  3%     +11.8%       2174 ±  3%  interrupts.CPU3.RES:Rescheduling_interrupts
      1982 ±  3%     +10.0%       2181 ±  3%  interrupts.CPU4.RES:Rescheduling_interrupts
      1991 ±  3%      +8.7%       2164 ±  3%  interrupts.CPU6.RES:Rescheduling_interrupts
      3485 ±  5%     +11.4%       3881 ±  4%  interrupts.CPU60.CAL:Function_call_interrupts
      1986 ±  3%     +10.4%       2194 ±  2%  interrupts.CPU7.RES:Rescheduling_interrupts
      1789 ±  4%     +13.6%       2033 ±  3%  interrupts.CPU73.RES:Rescheduling_interrupts
      1868 ±  3%     +10.3%       2060 ±  2%  interrupts.CPU76.RES:Rescheduling_interrupts
      1904 ±  2%      +9.4%       2082 ±  3%  interrupts.CPU77.RES:Rescheduling_interrupts
      1912 ±  2%     +11.5%       2131 ±  2%  interrupts.CPU78.RES:Rescheduling_interrupts
      1880 ±  5%     +12.4%       2113 ±  2%  interrupts.CPU79.RES:Rescheduling_interrupts
      1906 ±  3%     +12.0%       2134 ±  2%  interrupts.CPU80.RES:Rescheduling_interrupts
      1910 ±  3%     +12.4%       2146 ±  2%  interrupts.CPU82.RES:Rescheduling_interrupts
      1911 ±  2%     +10.9%       2120 ±  4%  interrupts.CPU83.RES:Rescheduling_interrupts
      1925           +12.1%       2158 ±  3%  interrupts.CPU85.RES:Rescheduling_interrupts
      1935 ±  3%      +9.7%       2123 ±  2%  interrupts.CPU86.RES:Rescheduling_interrupts
      1940 ±  4%     +11.3%       2158 ±  2%  interrupts.CPU87.RES:Rescheduling_interrupts
      1991 ±  3%     +11.2%       2213 ±  3%  interrupts.CPU9.RES:Rescheduling_interrupts
      1694 ±  2%      +9.7%       1858 ±  3%  interrupts.CPU93.RES:Rescheduling_interrupts
  11336532            -3.0%   10999810        perf-stat.i.branch-misses
     62.77            +1.2       64.01        perf-stat.i.cache-miss-rate%
  47260240            +9.2%   51618807 ±  6%  perf-stat.i.cache-misses
     10620            -4.7%      10117        perf-stat.i.context-switches
     11.12            +1.0%      11.23        perf-stat.i.cpi
      2471            -6.1%       2319        perf-stat.i.cpu-migrations
      9149            -7.5%       8464 ±  6%  perf-stat.i.cycles-between-cache-misses
 8.546e+08            -5.9%  8.045e+08        perf-stat.i.dTLB-stores
   5416307            -3.9%    5202887        perf-stat.i.iTLB-load-misses
    606729            -7.0%     564408 ±  4%  perf-stat.i.iTLB-loads
      7161            +3.8%       7436        perf-stat.i.instructions-per-iTLB-miss
      0.09            -1.5%       0.09        perf-stat.i.ipc
      0.79 ± 11%     -21.5%       0.62 ± 12%  perf-stat.i.major-faults
   5810373            +4.1%    6051325        perf-stat.i.node-store-misses
     41799            -4.6%      39889 ±  2%  perf-stat.i.node-stores
      0.14            -0.0        0.14        perf-stat.overall.branch-miss-rate%
     62.99            +1.1       64.12        perf-stat.overall.cache-miss-rate%
      9259            -7.6%       8560 ±  6%  perf-stat.overall.cycles-between-cache-misses
      7192            +3.8%       7468        perf-stat.overall.instructions-per-iTLB-miss
  11296461            -3.0%   10960807        perf-stat.ps.branch-misses
  47194756            +9.2%   51557679 ±  6%  perf-stat.ps.cache-misses
     10603            -4.7%      10102        perf-stat.ps.context-switches
      2468            -6.1%       2317        perf-stat.ps.cpu-migrations
 8.534e+08            -5.8%  8.036e+08        perf-stat.ps.dTLB-stores
   5408407            -3.9%    5196316        perf-stat.ps.iTLB-load-misses
    602701            -6.9%     561024 ±  4%  perf-stat.ps.iTLB-loads
      0.77 ± 10%     -19.7%       0.62 ± 10%  perf-stat.ps.major-faults
   5803928            +4.2%    6045605        perf-stat.ps.node-store-misses
     41864            -4.5%      39964 ±  2%  perf-stat.ps.node-stores
 2.393e+13            +7.0%  2.562e+13        perf-stat.total.instructions
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      1.86 ± 21%      -0.6        1.25 ± 26%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.86 ± 21%      -0.6        1.25 ± 26%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.86 ± 21%      -0.6        1.24 ± 26%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.86 ± 21%      -0.6        1.24 ± 26%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
      1.86 ± 21%      -0.6        1.25 ± 26%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      1.84 ± 22%      -0.6        1.22 ± 26%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
     46.27            +0.4       46.69        perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.ext4_orphan_add.ext4_truncate.ext4_setattr
     46.66            +0.4       47.09        perf-profile.calltrace.cycles-pp.__mutex_lock.ext4_orphan_add.ext4_truncate.ext4_setattr.notify_change
     47.09            +0.4       47.53        perf-profile.calltrace.cycles-pp.ext4_orphan_add.ext4_truncate.ext4_setattr.notify_change.do_truncate
     95.37            +0.6       95.97        perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.do_sys_open
     95.27            +0.6       95.88        perf-profile.calltrace.cycles-pp.do_truncate.do_open.path_openat.do_filp_open.do_sys_openat2
     94.89            +0.6       95.50        perf-profile.calltrace.cycles-pp.ext4_truncate.ext4_setattr.notify_change.do_truncate.do_open
     95.21            +0.6       95.83        perf-profile.calltrace.cycles-pp.notify_change.do_truncate.do_open.path_openat.do_filp_open
     95.20            +0.6       95.82        perf-profile.calltrace.cycles-pp.ext4_setattr.notify_change.do_truncate.do_open.path_openat
     97.87            +0.6       98.50        perf-profile.calltrace.cycles-pp.creat64
     97.76            +0.6       98.38        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
     97.75            +0.6       98.38        perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
     97.77            +0.6       98.40        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
     97.72            +0.6       98.35        perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
     97.72            +0.6       98.35        perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
     97.75            +0.6       98.38        perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      1.85 ± 22%      -0.6        1.23 ± 26%  perf-profile.children.cycles-pp.intel_idle
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.children.cycles-pp.do_idle
      1.86 ± 21%      -0.6        1.25 ± 26%  perf-profile.children.cycles-pp.start_secondary
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.children.cycles-pp.cpuidle_enter
      1.87 ± 21%      -0.6        1.26 ± 25%  perf-profile.children.cycles-pp.cpuidle_enter_state
      0.18 ±  2%      -0.0        0.15 ±  3%  perf-profile.children.cycles-pp.__mutex_unlock_slowpath
      0.51 ±  2%      +0.0        0.55 ±  3%  perf-profile.children.cycles-pp.ext4_mark_iloc_dirty
      0.00            +0.1        0.08 ±  8%  perf-profile.children.cycles-pp.ext4_inode_journal_mode
      0.00            +0.1        0.09 ±  4%  perf-profile.children.cycles-pp.ext4_fc_track_inode
     47.13            +0.4       47.57        perf-profile.children.cycles-pp.ext4_orphan_add
     94.77            +0.6       95.36        perf-profile.children.cycles-pp.osq_lock
     95.37            +0.6       95.97        perf-profile.children.cycles-pp.do_open
     95.27            +0.6       95.88        perf-profile.children.cycles-pp.do_truncate
     94.89            +0.6       95.50        perf-profile.children.cycles-pp.ext4_truncate
     95.21            +0.6       95.83        perf-profile.children.cycles-pp.notify_change
     95.20            +0.6       95.82        perf-profile.children.cycles-pp.ext4_setattr
     97.88            +0.6       98.50        perf-profile.children.cycles-pp.creat64
     97.75            +0.6       98.38        perf-profile.children.cycles-pp.do_sys_open
     97.75            +0.6       98.38        perf-profile.children.cycles-pp.do_sys_openat2
     97.72            +0.6       98.35        perf-profile.children.cycles-pp.do_filp_open
     97.72            +0.6       98.35        perf-profile.children.cycles-pp.path_openat
     97.95            +0.6       98.59        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     97.87            +0.6       98.51        perf-profile.children.cycles-pp.do_syscall_64
      1.85 ± 22%      -0.6        1.23 ± 26%  perf-profile.self.cycles-pp.intel_idle
      0.13 ±  2%      -0.0        0.10 ±  6%  perf-profile.self.cycles-pp.ext4_reserve_inode_write
      0.15 ±  3%      -0.0        0.13 ±  2%  perf-profile.self.cycles-pp.ext4_mark_iloc_dirty
      0.08 ±  4%      -0.0        0.06        perf-profile.self.cycles-pp.__mutex_unlock_slowpath
      0.07            -0.0        0.06        perf-profile.self.cycles-pp.mutex_lock
      0.18 ±  2%      +0.0        0.19 ±  2%  perf-profile.self.cycles-pp.__mutex_lock
      0.05 ±  8%      +0.0        0.08 ± 12%  perf-profile.self.cycles-pp.ext4_orphan_add
      0.00            +0.1        0.08 ±  8%  perf-profile.self.cycles-pp.ext4_inode_journal_mode
     94.04            +0.6       94.66        perf-profile.self.cycles-pp.osq_lock


                                                                                
                                  aim7.jobs-per-min                             
                                                                                
  10000 +-------------------------------------------------------------------+   
   9900 |-+        +.             +                      .+            .+   |   
        | +.       : +.           :+                +   +  :     +   +.  :  |   
   9800 |++ +..   :    +..       :  +    +         + + +    :   + + +    :  |   
   9700 |-+    +. :       +. .+  :   +  : +       +   +     : .+   +      :.|   
   9600 |-+      +     O    +  : :    + :  +.   ..           +            + |   
   9500 |-+    O               : :     +     +.+                            |   
        |          O            :              O                            |   
   9400 |-+                 O O + O  O     O        O                       |   
   9300 |-+                                                                 |   
   9200 |-+ O        O                 O                                    |   
   9100 |-O                              O        O     O                   |   
        |        O                           O        O        O            |   
   9000 |-+               O     O                         O  O              |   
   8900 +-------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00001-gefc61345274d" of type "text/plain" (172899 bytes)

View attachment "job-script" of type "text/plain" (8015 bytes)

View attachment "job.yaml" of type "text/plain" (5575 bytes)

View attachment "reproduce" of type "text/plain" (704 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ