Date:   Tue, 27 Apr 2021 15:34:48 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Dennis Zhou <dennis@...nel.org>
Cc:     Roman Gushchin <guro@...com>,
        Pratik Sampat <psampat@...ux.ibm.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [percpu]  ace7e70901:  aim9.sync_disk_rw.ops_per_sec -2.3% regression



Greetings,

FYI, we noticed a -2.3% regression in aim9.sync_disk_rw.ops_per_sec due to commit:


commit: ace7e7090137ee996757eb5eebc94439b0e2803a ("percpu: use reclaim threshold instead of running for every page")
https://git.kernel.org/cgit/linux/kernel/git/dennis/percpu.git for-5.14


in testcase: aim9
on test machine: 256 threads Intel(R) Genuine Intel(R) CPU 0000 @ 1.30GHz with 112G memory
with the following parameters:

	testtime: 300s
	test: sync_disk_rw
	cpufreq_governor: performance
	ucode: 0xffff0190

test-description: Suite IX is the "AIM Independent Resource Benchmark", the famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/lkp-knl-f1/sync_disk_rw/aim9/300s/0xffff0190
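
The two slash-separated lines above pair up position by position: the first names each test axis and the second gives its value. A minimal Python sketch of that pairing follows, assuming neither names nor values contain a "/" themselves; the helper name parse_axes is hypothetical and is not part of lkp-tests.

```python
# Hypothetical helper (not part of lkp-tests): pair the slash-separated
# axis names with the slash-separated values printed in the report.
KEYS_LINE = "compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime/ucode:"
VALUES_LINE = "gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/lkp-knl-f1/sync_disk_rw/aim9/300s/0xffff0190"


def parse_axes(keys_line, values_line):
    """Return a dict mapping each axis name to its value."""
    names = keys_line.strip().rstrip(":").split("/")
    values = values_line.strip().split("/")
    if len(names) != len(values):
        # Assumption: a count mismatch means a field contained a "/".
        raise ValueError("axis names and values do not line up")
    return dict(zip(names, values))


axes = parse_axes(KEYS_LINE, VALUES_LINE)
for name, value in axes.items():
    print(f"{name:>18}: {value}")
```

Read this way, the run used gcc-9, the performance governor, and the lkp-knl-f1 test box group, matching the parameters listed earlier in the report.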

commit: 
  f183324133 ("percpu: implement partial chunk depopulation")
  ace7e70901 ("percpu: use reclaim threshold instead of running for every page")

f183324133ea535d ace7e7090137ee996757eb5eebc 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    103531            -2.3%     101102        aim9.sync_disk_rw.ops_per_sec
     55.21            +3.0%      56.87        aim9.time.user_time
    421.91 ±  9%     +25.1%     527.64 ± 13%  sched_debug.cfs_rq:/.load_avg.max
     39017 ±  4%      -5.8%      36738 ±  4%  softirqs.CPU4.RCU
     39128 ±  4%      -7.3%      36268 ±  4%  softirqs.CPU7.RCU
      0.05 ± 14%     +39.5%       0.07 ± 20%  perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      3970 ±105%    +117.6%       8637 ± 36%  perf-sched.wait_and_delay.max.ms.preempt_schedule_common.__cond_resched.wait_for_completion.affine_move_task.__set_cpus_allowed_ptr
      0.05 ± 77%    +349.6%       0.21 ± 75%  perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.11 ± 84%    +494.5%       0.68 ±130%  perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      3970 ±105%    +117.6%       8637 ± 36%  perf-sched.wait_time.max.ms.preempt_schedule_common.__cond_resched.wait_for_completion.affine_move_task.__set_cpus_allowed_ptr
   8878045            -2.0%    8702700        proc-vmstat.numa_hit
   8878044            -2.0%    8702696        proc-vmstat.numa_local
    515593            -2.3%     503677        proc-vmstat.pgactivate
   8967593            -2.0%    8788379        proc-vmstat.pgalloc_normal
   8941947            -2.0%    8762451        proc-vmstat.pgfree
    396.17 ± 23%     -49.2%     201.14 ±  4%  interrupts.CPU226.NMI:Non-maskable_interrupts
    396.17 ± 23%     -49.2%     201.14 ±  4%  interrupts.CPU226.PMI:Performance_monitoring_interrupts
    266.50 ± 27%     -36.4%     169.43 ±  2%  interrupts.CPU242.NMI:Non-maskable_interrupts
    266.50 ± 27%     -36.4%     169.43 ±  2%  interrupts.CPU242.PMI:Performance_monitoring_interrupts
     92.33 ± 85%     -75.2%      22.86 ± 34%  interrupts.CPU33.RES:Rescheduling_interrupts
     26.83 ± 30%    +130.5%      61.86 ± 61%  interrupts.CPU45.RES:Rescheduling_interrupts
      7.63 ±  2%      +0.4        7.99        perf-stat.i.branch-miss-rate%
     16.71            -0.5       16.18        perf-stat.i.cache-miss-rate%
      8.36            -3.0%       8.11        perf-stat.i.cpi
      7.61 ±  2%      +0.4        7.97        perf-stat.overall.branch-miss-rate%
     16.70            -0.5       16.17        perf-stat.overall.cache-miss-rate%
      8.18            -2.8%       7.95        perf-stat.overall.cpi
      0.09 ±223%      +0.5        0.59 ±  5%  perf-profile.calltrace.cycles-pp.tick_check_broadcast_expired.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.07 ±  5%      -0.0        0.05 ± 41%  perf-profile.children.cycles-pp.timerqueue_iterate_next
      0.15 ±  4%      +0.0        0.17 ±  7%  perf-profile.children.cycles-pp.tick_nohz_idle_got_tick
      0.05 ±  9%      +0.0        0.10 ± 17%  perf-profile.children.cycles-pp.cpuidle_get_cpu_driver
      0.05 ±  7%      +0.0        0.10 ± 14%  perf-profile.children.cycles-pp.rcu_irq_exit
      0.23 ± 11%      +0.1        0.30 ±  8%  perf-profile.children.cycles-pp.cpumask_next_and
      0.34 ±  5%      +0.1        0.41 ±  7%  perf-profile.children.cycles-pp.rb_insert_color
      0.40 ±  7%      +0.1        0.49 ± 13%  perf-profile.children.cycles-pp.hrtimer_forward
      0.35 ±  8%      +0.1        0.45 ±  6%  perf-profile.children.cycles-pp.get_cpu_device
      0.47 ±  6%      +0.1        0.59 ±  4%  perf-profile.children.cycles-pp.tick_check_broadcast_expired
      0.60 ±  6%      -0.0        0.55 ±  7%  perf-profile.self.cycles-pp.perf_event_task_tick
      0.26 ±  8%      -0.0        0.22 ±  9%  perf-profile.self.cycles-pp.tick_nohz_irq_exit
      0.19 ±  3%      +0.0        0.23 ±  7%  perf-profile.self.cycles-pp.irqentry_enter
      0.03 ± 70%      +0.1        0.09 ± 16%  perf-profile.self.cycles-pp.cpuidle_get_cpu_driver
      0.32 ±  5%      +0.1        0.39 ±  7%  perf-profile.self.cycles-pp.rb_insert_color
      0.06 ± 47%      +0.1        0.13 ±  9%  perf-profile.self.cycles-pp.cpumask_next_and
      0.00            +0.1        0.08 ± 18%  perf-profile.self.cycles-pp.rcu_irq_exit
      0.34 ±  8%      +0.1        0.44 ±  5%  perf-profile.self.cycles-pp.get_cpu_device
      0.47 ±  5%      +0.1        0.59 ±  5%  perf-profile.self.cycles-pp.tick_check_broadcast_expired
      0.14 ±  9%      +0.1        0.26 ± 10%  perf-profile.self.cycles-pp.rcu_core
      0.77 ±  5%      +0.1        0.92 ±  3%  perf-profile.self.cycles-pp.tick_nohz_next_event
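
As a reading aid for the comparison table above, the %change column can be recomputed from the two value columns. Judging from the numbers in the table (an inference, not something the report states), rows printed with a "%" sign are relative changes against the parent commit, while rows without one, such as the miss-rate and perf-profile rows, are absolute differences. The sketch below is Python with hypothetical function names, not the LKP comparison tooling.

```python
# Hypothetical helpers for re-deriving the %change column of the table.
# Values are copied from the report: (parent f183324133, commit ace7e70901).

def relative_change(base, new):
    """Percent change of `new` relative to `base`."""
    return (new - base) / base * 100.0


def absolute_delta(base, new):
    """Plain difference, used for rows that are already percentages."""
    return new - base


relative_rows = {
    "aim9.sync_disk_rw.ops_per_sec": (103531, 101102),
    "aim9.time.user_time": (55.21, 56.87),
    "perf-stat.i.cpi": (8.36, 8.11),
}
absolute_rows = {
    "perf-stat.i.cache-miss-rate%": (16.71, 16.18),
}

for name, (base, new) in relative_rows.items():
    print(f"{relative_change(base, new):+6.1f}%  {name}")
for name, (base, new) in absolute_rows.items():
    print(f"{absolute_delta(base, new):+6.1f}   {name}")
```

For the headline metric this gives (101102 - 103531) / 103531, about -2.3%, which agrees with the figure in the subject line.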


                                                                                
                            aim9.sync_disk_rw.ops_per_sec                       
                                                                                
  105000 +------------------------------------------------------------------+   
         |                                 +                                |   
  104500 |-+                              : :                  +....        |   
  104000 |-+                              : :                 :     +       |   
         |                               :   :         +      :      +     +|   
  103500 |-+                            :    :        : :    :        +   + |   
  103000 |-+          +.               :      :       :  :   :         + +  |   
         |           .  ..             :      :      :   :  :           +   |   
  102500 |-+       ..       ..+..    .+        :    :     : :               |   
  102000 |...+... .       +.     . ..          +... :      +                |   
         |       +            O   +                +                        |   
  101500 |-+              O                                                 |   
  101000 |-+     O                O                                         |   
         |   O        O                                                     |   
  100500 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc7-00006-gace7e7090137" of type "text/plain" (172883 bytes)

View attachment "job-script" of type "text/plain" (7975 bytes)

View attachment "job.yaml" of type "text/plain" (5184 bytes)

View attachment "reproduce" of type "text/plain" (254 bytes)
