Message-ID: <202412201007.aa43a5fa-lkp@intel.com>
Date: Fri, 20 Dec 2024 10:46:35 +0800
From: kernel test robot <oliver.sang@...el.com>
To: K Prateek Nayak <kprateek.nayak@....com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>, <aubrey.li@...ux.intel.com>,
	<yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [sched/core]  ea9cffc0a1: stream.triad_bandwidth_MBps
 1.1% improvement



Hello,

kernel test robot noticed a 1.1% improvement of stream.triad_bandwidth_MBps on:


commit: ea9cffc0a154124821531991d5afdd7e8b20d7aa ("sched/core: Remove the unnecessary need_resched() check in nohz_csd_func()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stream
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
parameters:

	nr_threads: 50%
	iterations: 10x
	array_size: 50000000
	loop: 100
	cpufreq_governor: performance
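
For readers unfamiliar with the metric, the following is a minimal sketch of what a STREAM-style triad kernel computes and how a bandwidth figure in MB/s is conventionally derived from it. It is an illustration only, not the lkp-tests harness or the STREAM source: the function names are hypothetical, and the accounting of three 8-byte array accesses per element (two reads, one write) is an assumption based on the usual STREAM convention.

```python
import time


def triad(b, c, scalar):
    """Triad kernel: a[j] = b[j] + scalar * c[j] for every element."""
    return [bj + scalar * cj for bj, cj in zip(b, c)]


def triad_bytes(n, elem_size=8):
    """Bytes moved per triad pass: read b, read c, write a (assumed convention)."""
    return 3 * elem_size * n


def triad_bandwidth_mbps(n, seconds, elem_size=8):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for one pass over n elements."""
    return 1.0e-6 * triad_bytes(n, elem_size) / seconds


if __name__ == "__main__":
    n = 100_000  # far smaller than array_size above; for demonstration only
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    a = triad(b, c, 3.0)
    elapsed = time.perf_counter() - start
    print(f"triad over {n} elements: {triad_bandwidth_mbps(n, elapsed):.1f} MB/s")
```

Pure Python is far slower than the compiled benchmark, so the number printed here says nothing about the machine's memory bandwidth; only the arithmetic is of interest.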



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241220/202412201007.aa43a5fa-lkp@intel.com

=========================================================================================
array_size/compiler/cpufreq_governor/iterations/kconfig/loop/nr_threads/rootfs/tbox_group/testcase:
  50000000/gcc-12/performance/10x/x86_64-rhel-9.4/100/50%/debian-12-x86_64-20240206.cgz/lkp-skl-d02/stream

commit: 
  6675ce2004 ("softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel")
  ea9cffc0a1 ("sched/core: Remove the unnecessary need_resched() check in nohz_csd_func()")

6675ce20046d149e ea9cffc0a154124821531991d5a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     15264           +23.1%      18793        meminfo.Shmem
      0.02 ±  4%      +0.0        0.03 ±  4%  mpstat.cpu.all.soft%
      3818           +23.1%       4700        proc-vmstat.nr_shmem
    587.28          +302.4%       2363        vmstat.system.cs
      2577            -3.5%       2488        vmstat.system.in
     36673 ±  2%    +164.6%      97051 ±  2%  sched_debug.cpu.nr_switches.avg
     53585 ± 10%    +332.2%     231568 ± 16%  sched_debug.cpu.nr_switches.max
     12003 ± 23%    +578.7%      81463 ± 24%  sched_debug.cpu.nr_switches.stddev
    578.05          +310.5%       2372        perf-stat.i.context-switches
     14.72 ±  4%     +10.8%      16.30        perf-stat.i.cpu-migrations
      0.04 ±  5%    +268.8%       0.15        perf-stat.i.metric.K/sec
    575.63          +310.5%       2363        perf-stat.ps.context-switches
     14.65 ±  4%     +10.8%      16.23        perf-stat.ps.cpu-migrations
     18760            +1.0%      18950        stream.add_bandwidth_MBps
     18759            +1.0%      18948        stream.add_bandwidth_MBps_harmonicMean
     14581            +1.2%      14751        stream.scale_bandwidth_MBps
     14580            +1.2%      14748        stream.scale_bandwidth_MBps_harmonicMean
     18289            +1.1%      18487        stream.triad_bandwidth_MBps
     18287            +1.1%      18484        stream.triad_bandwidth_MBps_harmonicMean
      0.02 ± 12%     -32.3%       0.01 ± 16%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.02 ± 42%     -48.6%       0.01 ±  7%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.10 ± 70%    +332.7%       0.44 ± 95%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
     65.81 ±  3%     -68.0%      21.05 ±  3%  perf-sched.total_wait_and_delay.average.ms
      2011          +229.0%       6618 ±  4%  perf-sched.total_wait_and_delay.count.ms
     65.80 ±  3%     -68.0%      21.04 ±  3%  perf-sched.total_wait_time.average.ms
      3.86 ±  2%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    500.54           +24.3%     622.17 ±  5%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
    497.31 ± 14%     -98.6%       6.72 ±  7%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ± 15%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
     19.83 ± 22%    -100.0%       0.00        perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     53.83 ±  9%   +8594.4%       4680 ±  5%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     21.00          -100.0%       0.00        perf-sched.wait_and_delay.count.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3666 ± 51%     -72.7%       1000        perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      4.04          -100.0%       0.00        perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      1001          +136.0%       2362 ±  8%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.05 ± 37%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
    500.52           +24.3%     622.15 ±  5%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
    497.29 ± 14%     -98.7%       6.71 ±  7%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±165%    +525.0%       0.00 ± 68%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3666 ± 51%     -72.7%       1000        perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1001          +136.0%       2362 ±  8%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.01 ±142%    +247.8%       0.04 ± 54%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
     97.56            -0.4       97.12        perf-profile.calltrace.cycles-pp.main
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.calltrace.cycles-pp.common_startup_64
     97.61            -0.4       97.17        perf-profile.children.cycles-pp.main
      0.02 ±141%      +0.1        0.07 ± 14%  perf-profile.children.cycles-pp.poll_idle
      0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.__hrtimer_start_range_ns
      0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.dequeue_entity
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.enqueue_dl_entity
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.dl_server_start
      0.00            +0.1        0.06 ± 17%  perf-profile.children.cycles-pp.hrtimer_start_range_ns
      0.00            +0.1        0.06 ± 21%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.update_load_avg
      0.00            +0.1        0.08 ± 17%  perf-profile.children.cycles-pp.__pick_next_task
      0.00            +0.1        0.10 ± 19%  perf-profile.children.cycles-pp.dequeue_entities
      0.00            +0.1        0.11 ± 17%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.00            +0.1        0.11 ± 18%  perf-profile.children.cycles-pp.try_to_block_task
      0.01 ±223%      +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.enqueue_task
      0.00            +0.1        0.13 ±  8%  perf-profile.children.cycles-pp.enqueue_task_fair
      0.01 ±223%      +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.ttwu_do_activate
      0.05 ±  7%      +0.2        0.20 ± 11%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.00            +0.2        0.18 ± 11%  perf-profile.children.cycles-pp.try_to_wake_up
      0.07 ± 14%      +0.2        0.25 ± 18%  perf-profile.children.cycles-pp.kthread
      0.07 ±  8%      +0.2        0.25 ± 18%  perf-profile.children.cycles-pp.ret_from_fork
      0.07 ±  8%      +0.2        0.25 ± 19%  perf-profile.children.cycles-pp.ret_from_fork_asm
      0.02 ±141%      +0.2        0.20 ± 20%  perf-profile.children.cycles-pp.schedule
      0.00            +0.2        0.19 ± 24%  perf-profile.children.cycles-pp.smpboot_thread_fn
      0.05 ±  8%      +0.2        0.25 ± 21%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.common_startup_64
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.do_idle
      0.09 ± 39%      +0.2        0.34 ± 20%  perf-profile.children.cycles-pp.__schedule
     97.30            -0.4       96.86        perf-profile.self.cycles-pp.main
      0.02 ±141%      +0.0        0.06 ± 14%  perf-profile.self.cycles-pp.poll_idle
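
As a reading aid for the table above, here is a small sketch that recomputes the %change column from the two per-commit means. It assumes %change is the relative difference of the second commit's mean against the first, which is consistent with the rows shown; the helper name is hypothetical. Rows whose change carries no percent sign (mpstat and perf-profile) appear to be absolute differences in percentage points, for example 97.56 to 97.12 shown as -0.4.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (base mean, new mean) pairs copied from the table above.
rows = {
    "stream.triad_bandwidth_MBps": (18289, 18487),
    "vmstat.system.cs": (587.28, 2363),
    "meminfo.Shmem": (15264, 18793),
}

for name, (base, new) in rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")
```

The table prints rounded means, so recomputing from the displayed values can differ from the reported %change in the last digit for some rows.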




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

