Message-ID: <202309051653.1dce02c8-oliver.sang@intel.com>
Date:   Wed, 6 Sep 2023 09:18:21 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>,
        Doug Smythies <dsmythies@...us.net>,
        <linux-pm@...r.kernel.org>, <ying.huang@...el.com>,
        <feng.tang@...el.com>, <fengwei.yin@...el.com>,
        <oliver.sang@...el.com>
Subject: [linus:master] [cpuidle]  5484e31bbb:
 adrestia.wakeup_cost_periodic_us -33.3% improvement



Hello,

kernel test robot noticed a 33.3% improvement (a -33.3% change, where lower is better) of adrestia.wakeup_cost_periodic_us on:


commit: 5484e31bbbff285f9505c4766373f840ffb746e5 ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: adrestia
test machine: 8 threads, 1 socket, Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (Haswell) with 8G memory
parameters:

	nr_threads: 100
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230905/202309051653.1dce02c8-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
  gcc-12/performance/x86_64-rhel-8.3/100/debian-11.1-x86_64-20220510.cgz/lkp-hsw-d04/adrestia

commit: 
  2662342079 ("cpuidle: teo: Gather statistics regarding whether or not to stop the tick")
  5484e31bbb ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")

2662342079f54b8a 5484e31bbbff285f9505c476637 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.06 ± 56%     -52.2%       0.03 ± 17%  perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      9603            -0.8%       9529        proc-vmstat.nr_slab_unreclaimable
     12707            -6.7%      11859        vmstat.system.in
      0.91            -0.2        0.74        mpstat.cpu.all.irq%
      0.06            -0.0        0.05        mpstat.cpu.all.soft%
    698830 ± 11%     +18.4%     827374 ±  7%  sched_debug.cpu.avg_idle.max
    222705 ± 10%     +18.7%     264460 ±  5%  sched_debug.cpu.avg_idle.stddev
      0.34 ± 16%     +35.5%       0.47 ± 15%  sched_debug.cpu.clock.stddev
    242150           -79.0%      50912 ±  7%  adrestia.time.involuntary_context_switches
     38.40            -5.7%      36.20        adrestia.time.percent_of_cpu_this_job_got
    138.60            -6.7%     129.34        adrestia.time.system_time
      6.00           -33.3%       4.00        adrestia.wakeup_cost_periodic_us
   5674120          +110.4%   11939267        turbostat.C1
     33.31            +5.8       39.11        turbostat.C1%
   3296324           +24.3%    4096313        turbostat.C1E
      4.70            +6.5       11.18 ±  2%  turbostat.C1E%
   5791021           -48.2%    3001325 ±  2%  turbostat.C3
     17.79            +4.8       22.59        turbostat.C3%
    810414           -85.6%     117003 ±  4%  turbostat.C6
      8.11            -6.8        1.31 ±  5%  turbostat.C6%
   1442211           -52.8%     680532 ±  2%  turbostat.C7s
     23.74            -9.1       14.69 ±  2%  turbostat.C7s%
     61.86           +20.3%      74.43        turbostat.CPU%c1
     14.52 ±  2%     -28.7%      10.35 ±  3%  turbostat.CPU%c3
      3.89           -93.4%       0.26 ± 11%  turbostat.CPU%c6
      7.81 ±  5%     -49.6%       3.94 ±  4%  turbostat.CPU%c7
     10.66            +3.9%      11.08        turbostat.CorWatt
      3.49            -0.5        3.02        turbostat.POLL%
      1.48 ±  2%     -75.0%       0.37 ± 18%  turbostat.Pkg%pc2
     18.29            +5.4%      19.28        turbostat.PkgWatt
     12.72           -16.8%      10.59        perf-stat.i.MPKI
 4.574e+08            -3.8%  4.398e+08        perf-stat.i.branch-instructions
      1.46            -0.1        1.31        perf-stat.i.branch-miss-rate%
   7284139            -8.1%    6690632        perf-stat.i.branch-misses
      2.68            -0.9        1.80        perf-stat.i.cache-miss-rate%
    458457           -41.3%     268958        perf-stat.i.cache-misses
  18029497           -17.1%   14954548        perf-stat.i.cache-references
      2.25            -6.0%       2.11        perf-stat.i.cpi
 3.612e+09            -8.1%  3.318e+09        perf-stat.i.cpu-cycles
      9738           -64.2%       3490 ±  4%  perf-stat.i.cpu-migrations
      9532           +49.8%      14284        perf-stat.i.cycles-between-cache-misses
      0.58 ±  2%      -0.2        0.43 ±  4%  perf-stat.i.dTLB-load-miss-rate%
   1915780           -22.0%    1494726 ±  7%  perf-stat.i.dTLB-load-misses
      0.31 ±  3%      -0.1        0.25 ±  5%  perf-stat.i.dTLB-store-miss-rate%
    611947 ±  2%     -22.2%     476112 ±  3%  perf-stat.i.dTLB-store-misses
  3.11e+08            -4.8%  2.962e+08        perf-stat.i.dTLB-stores
     50.70           -12.1       38.63        perf-stat.i.iTLB-load-miss-rate%
    709004           -14.4%     606971 ±  4%  perf-stat.i.iTLB-load-misses
    641330           +22.2%     783541        perf-stat.i.iTLB-loads
 2.025e+09            -3.2%   1.96e+09        perf-stat.i.instructions
      2772           +23.4%       3420 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      0.48            +5.4%       0.50        perf-stat.i.ipc
      0.45            -8.1%       0.41        perf-stat.i.metric.GHz
    161.95            -3.4%     156.41        perf-stat.i.metric.M/sec
      8.90           -14.3%       7.63        perf-stat.overall.MPKI
      1.59            -0.1        1.52        perf-stat.overall.branch-miss-rate%
      2.54            -0.7        1.80        perf-stat.overall.cache-miss-rate%
      1.78            -5.1%       1.69        perf-stat.overall.cpi
      7878           +56.6%      12335        perf-stat.overall.cycles-between-cache-misses
      0.37            -0.1        0.30 ±  6%  perf-stat.overall.dTLB-load-miss-rate%
      0.20 ±  2%      -0.0        0.16 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
     52.51            -8.9       43.63 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
      2856           +13.2%       3234 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
      0.56            +5.4%       0.59        perf-stat.overall.ipc
 4.565e+08            -3.8%   4.39e+08        perf-stat.ps.branch-instructions
   7270403            -8.1%    6677885        perf-stat.ps.branch-misses
    457602           -41.3%     268449        perf-stat.ps.cache-misses
  17995840           -17.1%   14926426        perf-stat.ps.cache-references
 3.605e+09            -8.1%  3.312e+09        perf-stat.ps.cpu-cycles
      9720           -64.2%       3484 ±  4%  perf-stat.ps.cpu-migrations
   1912206           -22.0%    1491912 ±  7%  perf-stat.ps.dTLB-load-misses
    610806 ±  2%     -22.2%     475217 ±  3%  perf-stat.ps.dTLB-store-misses
 3.104e+08            -4.8%  2.956e+08        perf-stat.ps.dTLB-stores
    707671           -14.4%     605828 ±  4%  perf-stat.ps.iTLB-load-misses
    640132           +22.2%     782070        perf-stat.ps.iTLB-loads
 2.021e+09            -3.2%  1.957e+09        perf-stat.ps.instructions
 1.089e+12            -3.9%  1.047e+12        perf-stat.total.instructions
     28.35 ±  5%      -3.8       24.57 ± 11%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     70.22            -2.0       68.23        perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      2.07 ±  6%      -0.5        1.57 ± 11%  perf-profile.calltrace.cycles-pp.menu_select.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      3.95 ±  5%      -0.4        3.57 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
      4.02 ±  4%      -0.4        3.65 ±  2%  perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.82 ± 14%      +0.2        1.03 ± 15%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      2.07            +0.2        2.29 ±  9%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.start_thread
      0.86 ± 15%      +0.2        1.08 ± 12%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      3.22 ±  3%      +0.3        3.53 ±  6%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.start_thread
      3.27 ±  3%      +0.3        3.59 ±  6%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write.start_thread
      1.30 ± 12%      +0.4        1.71 ± 12%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      1.41 ± 11%      +0.4        1.84 ± 10%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.21 ±122%      +0.4        0.66 ±  8%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64
      4.08 ±  3%      +0.6        4.65 ±  8%  perf-profile.calltrace.cycles-pp.__libc_write.start_thread
     15.14 ±  2%      +0.9       16.08 ±  6%  perf-profile.calltrace.cycles-pp.start_thread
     28.56 ±  5%      -3.8       24.73 ± 11%  perf-profile.children.cycles-pp.poll_idle
     70.19            -2.0       68.19        perf-profile.children.cycles-pp.do_idle
     70.22            -2.0       68.23        perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     70.22            -2.0       68.23        perf-profile.children.cycles-pp.cpu_startup_entry
      2.40 ±  4%      -0.6        1.80 ±  9%  perf-profile.children.cycles-pp.menu_select
      0.74 ± 18%      -0.4        0.33 ± 32%  perf-profile.children.cycles-pp.newidle_balance
      1.47 ±  8%      -0.4        1.05 ± 14%  perf-profile.children.cycles-pp.pick_next_task_fair
      4.13 ±  4%      -0.4        3.72 ±  2%  perf-profile.children.cycles-pp.schedule
      0.76 ± 13%      -0.4        0.38 ± 26%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.58 ± 19%      -0.3        0.29 ± 24%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.55 ± 14%      -0.3        0.26 ± 45%  perf-profile.children.cycles-pp.load_balance
      0.42 ± 23%      -0.2        0.24 ± 32%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.22 ± 17%      -0.1        0.09 ± 31%  perf-profile.children.cycles-pp.hrtimer_next_event_without
      0.21 ± 31%      -0.1        0.09 ± 65%  perf-profile.children.cycles-pp.select_idle_cpu
      0.16 ± 25%      -0.1        0.06 ± 88%  perf-profile.children.cycles-pp.set_task_cpu
      0.19 ± 47%      -0.1        0.09 ± 42%  perf-profile.children.cycles-pp.leave_mm
      0.27 ± 18%      -0.1        0.18 ± 23%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.13 ± 32%      -0.1        0.04 ± 90%  perf-profile.children.cycles-pp.__hrtimer_next_event_base
      0.26 ±  3%      -0.1        0.18 ± 13%  perf-profile.children.cycles-pp.tick_nohz_idle_enter
      0.09 ± 26%      +0.1        0.16 ± 28%  perf-profile.children.cycles-pp.clockevents_program_event
      0.44 ±  9%      +0.1        0.56 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.37 ±  3%      +0.2        0.54 ± 11%  perf-profile.children.cycles-pp.mutex_unlock
      0.46 ± 13%      +0.2        0.68 ± 10%  perf-profile.children.cycles-pp.prepare_to_wait_event
      1.63 ±  3%      +0.2        1.87 ±  5%  perf-profile.children.cycles-pp.__entry_text_start
      1.75 ± 12%      +0.3        2.04 ± 11%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      1.87 ± 11%      +0.3        2.19 ±  9%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      1.32 ±  4%      +0.4        1.76 ±  7%  perf-profile.children.cycles-pp.syscall_return_via_sysret
     13.54 ±  2%      +0.6       14.15 ±  2%  perf-profile.children.cycles-pp.__libc_read
     15.14 ±  2%      +0.9       16.08 ±  6%  perf-profile.children.cycles-pp.start_thread
     27.99 ±  6%      -3.7       24.26 ± 11%  perf-profile.self.cycles-pp.poll_idle
      0.56 ± 20%      -0.3        0.28 ± 24%  perf-profile.self.cycles-pp.switch_mm_irqs_off
      1.34 ±  5%      -0.3        1.07 ± 11%  perf-profile.self.cycles-pp.menu_select
      0.24 ± 12%      -0.1        0.18 ±  8%  perf-profile.self.cycles-pp.update_curr
      0.14 ± 12%      -0.0        0.11 ± 10%  perf-profile.self.cycles-pp.touch_atime
      0.03 ±127%      +0.1        0.09 ± 24%  perf-profile.self.cycles-pp.copy_page_from_iter
      0.21 ± 17%      +0.1        0.35 ± 19%  perf-profile.self.cycles-pp.prepare_to_wait_event
      0.44 ± 20%      +0.2        0.59 ± 13%  perf-profile.self.cycles-pp.pipe_write
      0.36 ±  6%      +0.2        0.54 ± 12%  perf-profile.self.cycles-pp.mutex_unlock
      1.42 ±  4%      +0.2        1.60 ±  6%  perf-profile.self.cycles-pp.__entry_text_start
      1.32 ±  4%      +0.4        1.76 ±  6%  perf-profile.self.cycles-pp.syscall_return_via_sysret
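
The %change column in the table above can be checked by hand. The sketch below is not part of the robot's tooling. It assumes, based only on the rows shown here, that values with a trailing "%" in the %change column are relative changes from the parent commit (2662342079) to the tested commit (5484e31bbb), and that values without it (for example turbostat.C1% or the perf-profile rows) are plain differences between the two columns. The helper names relative_change and point_change are hypothetical.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed meaning of 'NN.N%')."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Plain difference new - base (assumed meaning of %change values without '%')."""
    return new - base


# (parent commit value, tested commit value), copied from the table above.
relative_rows = {
    "adrestia.wakeup_cost_periodic_us": (6.00, 4.00),
    "adrestia.time.involuntary_context_switches": (242150, 50912),
    "turbostat.C1": (5674120, 11939267),
}
point_rows = {
    "turbostat.C1%": (33.31, 39.11),
}

for name, (base, new) in relative_rows.items():
    print(f"{name}: {relative_change(base, new):+.1f}%")

for name, (base, new) in point_rows.items():
    print(f"{name}: {point_change(base, new):+.1f}")
```

Under these assumptions the recomputed values should agree with the %change column for the rows listed; the reported figures are means over several runs, so small rounding differences are possible for other rows.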



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
