Date:   Wed, 13 Sep 2023 23:05:34 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...nel.org>,
        <ying.huang@...el.com>, <feng.tang@...el.com>,
        <fengwei.yin@...el.com>, <aubrey.li@...ux.intel.com>,
        <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [sched/fair]  5e963f2bd4:
 will-it-scale.per_thread_ops 2.5% improvement


Hi Peter Zijlstra,

Yu helped review this report. Although it may not be as valuable as the
hackbench/netperf reports for EEVDF, which show huge performance differences,
we are reporting it as an FYI because the results stayed quite stable even
after rebuilding the kernel and rerunning the test several more times.


Hello,

kernel test robot noticed a 2.5% improvement of will-it-scale.per_thread_ops on:


commit: 5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: thread
	test: context_switch1
	cpufreq_governor: performance
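
For readers unfamiliar with the workload, here is a minimal Python sketch of the
kind of loop context_switch1 is assumed to run: two threads bounce a single byte
over a pair of pipes, so every iteration forces each side to block and be woken
(the pipe_read, pipe_write and schedule frames in the profile are consistent with
this). The function name ping_pong and the iteration count are hypothetical; the
real test is written in C and is not reproduced here.

```python
import os
import threading


def ping_pong(iterations: int) -> int:
    """Bounce one byte between two threads over a pair of pipes.

    Illustrative only: this mimics the assumed shape of the workload,
    not the actual will-it-scale source.
    """
    r1, w1 = os.pipe()  # main thread -> peer thread
    r2, w2 = os.pipe()  # peer thread -> main thread

    def peer() -> None:
        for _ in range(iterations):
            os.read(r1, 1)       # block until the main thread writes
            os.write(w2, b"x")   # wake the main thread

    worker = threading.Thread(target=peer)
    worker.start()

    completed = 0
    for _ in range(iterations):
        os.write(w1, b"x")       # wake the peer
        os.read(r2, 1)           # block until the peer answers
        completed += 1

    worker.join()
    for fd in (r1, w1, r2, w2):
        os.close(fd)
    return completed


if __name__ == "__main__":
    print(ping_pong(10_000), "round trips completed")
```

Each round trip involves at least two sleeps and two wakeups, which is why the
operation count tracks vmstat.system.cs so closely in the results.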



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230913/202309132209.cae4f58a-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/context_switch1/will-it-scale

commit: 
  e8f331bcc2 ("sched/smp: Use lag to simplify cross-runqueue placement")
  5e963f2bd4 ("sched/fair: Commit to EEVDF")

e8f331bcc270354a 5e963f2bd4654a202a8a05aa3a8 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  18121238            +2.5%   18575201        vmstat.system.cs
  18317774            +2.5%   18781349        will-it-scale.104.threads
    176131            +2.5%     180589        will-it-scale.per_thread_ops
  18317774            +2.5%   18781349        will-it-scale.workload
 1.257e+08           -96.7%    4139803        sched_debug.sysctl_sched.sysctl_sched_features
      0.75          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_idle_min_granularity
     24.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_latency
      4.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_wakeup_granularity
      1.65            +0.0        1.68        perf-stat.i.branch-miss-rate%
 4.185e+08            +1.3%   4.24e+08        perf-stat.i.branch-misses
  18284380            +2.5%   18745294        perf-stat.i.context-switches
      0.10            +0.0        0.10        perf-stat.i.dTLB-load-miss-rate%
  37343347            +2.5%   38269096        perf-stat.i.dTLB-load-misses
 3.711e+10            -1.1%  3.671e+10        perf-stat.i.dTLB-loads
 2.231e+10            -1.0%  2.208e+10        perf-stat.i.dTLB-stores
     60.89           +15.4       76.32        perf-stat.i.iTLB-load-miss-rate%
  42744641 ±  3%     +60.6%   68665465 ±  3%  perf-stat.i.iTLB-load-misses
  27283919           -21.7%   21361180 ±  2%  perf-stat.i.iTLB-loads
      3211 ±  3%     -37.7%       2001 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      0.10            +0.0        0.10        perf-stat.overall.dTLB-load-miss-rate%
     61.02           +15.2       76.26        perf-stat.overall.iTLB-load-miss-rate%
      3060 ±  3%     -38.2%       1890 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
   2146287            -3.2%    2077784        perf-stat.overall.path-length
 4.171e+08            +1.3%  4.226e+08        perf-stat.ps.branch-misses
  18221874            +2.5%   18680885        perf-stat.ps.context-switches
  37218153            +2.5%   38141041        perf-stat.ps.dTLB-load-misses
 3.699e+10            -1.1%  3.659e+10        perf-stat.ps.dTLB-loads
 2.223e+10            -1.0%  2.201e+10        perf-stat.ps.dTLB-stores
  42595400 ±  3%     +60.6%   68425583 ±  3%  perf-stat.ps.iTLB-load-misses
  27192032           -21.7%   21288405 ±  2%  perf-stat.ps.iTLB-loads
     25.66            -1.0       24.68        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     27.29            -0.8       26.46        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     38.80            -0.8       38.02        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
     35.13            -0.8       34.36        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
     15.91            -0.6       15.26        perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
     15.44            -0.6       14.79        perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
     22.78            -0.6       22.19        perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.33            -0.5        8.81        perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
      3.70            -0.4        3.29        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read
      1.41            -0.4        1.03 ±  2%  perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
      1.66            -0.3        1.36 ±  2%  perf-profile.calltrace.cycles-pp.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
      7.52            -0.3        7.24        perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
      7.31            -0.3        7.06        perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
      1.31 ±  2%      -0.2        1.13 ±  2%  perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.pipe_read.vfs_read
      1.57            -0.2        1.40 ±  3%  perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
      0.71 ±  3%      -0.1        0.62 ±  4%  perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.enqueue_task_fair.activate_task.ttwu_do_activate
      0.71 ±  3%      -0.1        0.63 ±  3%  perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.dequeue_task_fair.__schedule.schedule
      0.84            -0.0        0.80        perf-profile.calltrace.cycles-pp.___perf_sw_event.prepare_task_switch.__schedule.schedule.pipe_read
      0.87            -0.0        0.84        perf-profile.calltrace.cycles-pp.place_entity.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
      0.84 ±  2%      +0.1        0.91 ±  2%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
      1.08            +0.1        1.16 ±  2%  perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.69            +0.1        0.76        perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__schedule.schedule
      1.43            +0.1        1.52        perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.pipe_read
      1.02 ±  3%      +0.1        1.12 ±  2%  perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.04 ±  3%      +0.1        1.15 ±  2%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
      1.29 ±  4%      +0.2        1.48 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__libc_read
     25.69            -1.0       24.71        perf-profile.children.cycles-pp.vfs_read
     27.34            -0.8       26.52        perf-profile.children.cycles-pp.ksys_read
     22.95            -0.6       22.38        perf-profile.children.cycles-pp.pipe_read
     17.97            -0.5       17.44        perf-profile.children.cycles-pp.__schedule
      9.38            -0.5        8.86        perf-profile.children.cycles-pp.ttwu_do_activate
     18.42            -0.5       17.90        perf-profile.children.cycles-pp.schedule
      5.42            -0.4        5.04        perf-profile.children.cycles-pp.pick_next_task_fair
      1.43            -0.3        1.09 ±  2%  perf-profile.children.cycles-pp.check_preempt_wakeup
      1.67            -0.3        1.38 ±  2%  perf-profile.children.cycles-pp.check_preempt_curr
      3.09 ±  2%      -0.3        2.81 ±  2%  perf-profile.children.cycles-pp.reweight_entity
      7.53            -0.3        7.26        perf-profile.children.cycles-pp.activate_task
      7.33            -0.2        7.08        perf-profile.children.cycles-pp.enqueue_task_fair
      1.68 ±  2%      -0.2        1.46 ±  2%  perf-profile.children.cycles-pp.prepare_task_switch
      0.35 ±  2%      -0.2        0.17 ±  4%  perf-profile.children.cycles-pp.pick_next_entity
      0.52 ± 10%      -0.1        0.39 ± 11%  perf-profile.children.cycles-pp.cpuacct_charge
      0.88 ±  2%      -0.1        0.77 ±  3%  perf-profile.children.cycles-pp.__calc_delta
      0.41 ±  6%      -0.1        0.31 ±  9%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.78 ±  2%      -0.1        0.70 ±  2%  perf-profile.children.cycles-pp.put_prev_entity
      0.57            -0.0        0.52        perf-profile.children.cycles-pp.__cond_resched
      0.24 ±  2%      -0.0        0.20 ±  3%  perf-profile.children.cycles-pp.__list_add_valid
      1.29            -0.0        1.25        perf-profile.children.cycles-pp.mutex_lock
      0.41 ±  3%      -0.0        0.37 ±  3%  perf-profile.children.cycles-pp.__list_del_entry_valid
      0.27 ±  2%      -0.0        0.23 ±  2%  perf-profile.children.cycles-pp.check_cfs_rq_runtime
      0.53            -0.0        0.50 ±  2%  perf-profile.children.cycles-pp.copyout
      0.50 ±  2%      -0.0        0.47 ±  2%  perf-profile.children.cycles-pp.__pthread_disable_asynccancel
      0.20 ±  4%      -0.0        0.18 ±  5%  perf-profile.children.cycles-pp.inode_needs_update_time
      0.12 ±  4%      -0.0        0.10 ±  4%  perf-profile.children.cycles-pp.kill_fasync
      0.21 ±  4%      +0.0        0.24 ±  2%  perf-profile.children.cycles-pp.__x64_sys_write
      0.19 ±  4%      +0.0        0.22 ±  5%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.08 ±  6%      +0.0        0.12 ±  5%  perf-profile.children.cycles-pp.__rb_insert_augmented
      0.99            +0.0        1.03        perf-profile.children.cycles-pp.__switch_to
      1.33            +0.1        1.39        perf-profile.children.cycles-pp.__update_load_avg_se
      0.00            +0.1        0.06 ±  8%  perf-profile.children.cycles-pp.make_vfsgid
      0.26 ±  3%      +0.1        0.32 ±  3%  perf-profile.children.cycles-pp.finish_task_switch
      0.54 ±  2%      +0.1        0.60 ±  2%  perf-profile.children.cycles-pp.fput
      0.86 ±  2%      +0.1        0.93 ±  2%  perf-profile.children.cycles-pp.atime_needs_update
      0.50            +0.1        0.57 ±  2%  perf-profile.children.cycles-pp.__dequeue_entity
      1.09            +0.1        1.17 ±  2%  perf-profile.children.cycles-pp.touch_atime
      1.82            +0.1        1.93        perf-profile.children.cycles-pp.set_next_entity
      2.03 ±  3%      +0.1        2.17        perf-profile.children.cycles-pp.__fget_light
      2.10 ±  3%      +0.1        2.24        perf-profile.children.cycles-pp.__fdget_pos
      4.33            +0.2        4.55        perf-profile.children.cycles-pp.update_load_avg
      1.48            -0.3        1.14        perf-profile.self.cycles-pp.vfs_read
      0.60 ±  6%      -0.2        0.42 ±  7%  perf-profile.self.cycles-pp.prepare_task_switch
      0.71            -0.2        0.54 ±  2%  perf-profile.self.cycles-pp.check_preempt_wakeup
      0.50 ± 10%      -0.1        0.38 ± 11%  perf-profile.self.cycles-pp.cpuacct_charge
      0.87 ±  2%      -0.1        0.76 ±  3%  perf-profile.self.cycles-pp.__calc_delta
      0.62 ±  4%      -0.1        0.53        perf-profile.self.cycles-pp.dequeue_entity
      0.39 ±  7%      -0.1        0.30 ±  9%  perf-profile.self.cycles-pp.switch_mm_irqs_off
      0.32 ±  3%      -0.1        0.23 ±  2%  perf-profile.self.cycles-pp.put_prev_entity
      1.37            -0.1        1.30 ±  2%  perf-profile.self.cycles-pp.pick_next_task_fair
      0.58 ±  3%      -0.1        0.52 ±  3%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.76 ±  3%      -0.1        0.71 ±  2%  perf-profile.self.cycles-pp.__libc_read
      0.40 ±  2%      -0.0        0.36 ±  2%  perf-profile.self.cycles-pp.__cond_resched
      0.23 ±  2%      -0.0        0.19 ±  3%  perf-profile.self.cycles-pp.__list_add_valid
      0.18 ±  3%      -0.0        0.14 ±  4%  perf-profile.self.cycles-pp.check_cfs_rq_runtime
      0.36 ±  2%      -0.0        0.33 ±  2%  perf-profile.self.cycles-pp.copyout
      0.19 ±  3%      -0.0        0.16 ±  5%  perf-profile.self.cycles-pp.activate_task
      0.45 ±  2%      -0.0        0.42 ±  3%  perf-profile.self.cycles-pp.__pthread_disable_asynccancel
      0.19 ±  3%      -0.0        0.17 ±  4%  perf-profile.self.cycles-pp.inode_needs_update_time
      0.10 ±  5%      -0.0        0.08 ±  4%  perf-profile.self.cycles-pp.kill_fasync
      0.12 ±  5%      +0.0        0.14 ±  5%  perf-profile.self.cycles-pp.exit_to_user_mode_loop
      0.34 ±  3%      +0.0        0.37 ±  3%  perf-profile.self.cycles-pp.ksys_write
      0.18 ±  3%      +0.0        0.22 ±  5%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
      0.08 ±  5%      +0.0        0.11 ±  4%  perf-profile.self.cycles-pp.__rb_insert_augmented
      0.08 ±  7%      +0.0        0.12 ±  3%  perf-profile.self.cycles-pp.rb_next
      0.90            +0.0        0.95        perf-profile.self.cycles-pp.__switch_to
      0.39 ±  2%      +0.0        0.43 ±  2%  perf-profile.self.cycles-pp.__dequeue_entity
      0.22 ±  2%      +0.0        0.26 ±  2%  perf-profile.self.cycles-pp.ttwu_do_activate
      1.30            +0.1        1.35        perf-profile.self.cycles-pp.__update_load_avg_se
      0.46 ±  2%      +0.1        0.51 ±  2%  perf-profile.self.cycles-pp.fput
      0.00            +0.1        0.06 ±  8%  perf-profile.self.cycles-pp.make_vfsgid
      0.19 ±  4%      +0.1        0.24 ±  4%  perf-profile.self.cycles-pp.finish_task_switch
      0.38 ±  3%      +0.1        0.47 ±  3%  perf-profile.self.cycles-pp.ksys_read
      1.43 ±  2%      +0.1        1.54 ±  2%  perf-profile.self.cycles-pp.pipe_write
      1.05 ±  2%      +0.1        1.17 ±  2%  perf-profile.self.cycles-pp.vfs_write
      2.00 ±  3%      +0.1        2.14        perf-profile.self.cycles-pp.__fget_light
      1.71 ±  2%      +0.1        1.86 ±  2%  perf-profile.self.cycles-pp.update_load_avg
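
A note on reading the %change column: rows that are plain counts appear to be
reported as a relative change in percent, while rows that are already
percentages (the miss-rate% and cycles-pp rows) appear to be reported as a
difference in percentage points. The sketch below recomputes a few rows under
that assumption. The helper names are hypothetical, and the miss-rate formula
(misses divided by misses plus loads) is an assumption that happens to land
close to the reported value, not something stated in the report.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change in percent, as assumed for count-style rows."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Difference in percentage points, as assumed for rate-style rows."""
    return new_pct - base_pct


def miss_rate(misses: float, loads: float) -> float:
    """Assumed definition of a miss rate: misses / (misses + loads), in percent."""
    return misses / (misses + loads) * 100.0


if __name__ == "__main__":
    # will-it-scale.per_thread_ops: 176131 -> 180589
    print(f"per_thread_ops:      {relative_change(176131, 180589):+.1f}%")
    # perf-stat.i.iTLB-load-misses: 42744641 -> 68665465
    print(f"iTLB-load-misses:    {relative_change(42744641, 68665465):+.1f}%")
    # perf-stat.i.iTLB-load-miss-rate%: 60.89 -> 76.32
    print(f"iTLB-load-miss-rate: {point_change(60.89, 76.32):+.1f} points")
    # perf-stat.ps values for the base commit; compare with 61.02 in the table
    print(f"derived miss rate:   {miss_rate(42595400, 27192032):.2f}%")
```

Under these assumptions the recomputed values agree with the table to within
rounding, which suggests the large iTLB figures are genuine relative changes
rather than a formatting artifact.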



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
