Message-ID: <202510231459.ad690ecd-lkp@intel.com>
Date: Thu, 23 Oct 2025 15:26:05 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Chen Yu <yu.c.chen@...el.com>,
	Tim Chen <tim.c.chen@...ux.intel.com>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, <aubrey.li@...ux.intel.com>, Peter Zijlstra
	<peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, K Prateek Nayak
	<kprateek.nayak@....com>, "Gautham R . Shenoy" <gautham.shenoy@....com>,
	Vincent Guittot <vincent.guittot@...aro.org>, Juri Lelli
	<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, "Steven
 Rostedt" <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Madadi Vineeth
 Reddy" <vineethr@...ux.ibm.com>, Hillf Danton <hdanton@...a.com>, "Shrikanth
 Hegde" <sshegde@...ux.ibm.com>, Jianyong Wu <jianyong.wu@...look.com>,
	"Yangyu Chen" <cyy@...self.name>, Tingyin Duan <tingyin.duan@...il.com>, Vern
 Hao <vernhao@...cent.com>, Len Brown <len.brown@...el.com>, Aubrey Li
	<aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>, Chen Yu
	<yu.chen.surf@...il.com>, Libo Chen <libo.chen@...cle.com>, Adam Li
	<adamli@...amperecomputing.com>, Tim Chen <tim.c.chen@...el.com>,
	<oliver.sang@...el.com>
Subject: Re: [PATCH 01/19] sched/fair: Add infrastructure for cache-aware
 load balancing



Hello,

The kernel test robot noticed a 5.1% improvement in will-it-scale.per_thread_ops on:


commit: ddf7df94672b42db9a86b3225cf9ebcfdfefc506 ("[PATCH 01/19] sched/fair: Add infrastructure for cache-aware load balancing")
url: https://github.com/intel-lab-lkp/linux/commits/Tim-Chen/sched-fair-Add-infrastructure-for-cache-aware-load-balancing/20251012-022248
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 45b7f780739a3145aeef24d2dfa02517a6c82ed6
patch link: https://lore.kernel.org/all/865b852e3fdef6561c9e0a5be9a94aec8a68cdea.1760206683.git.tim.c.chen@linux.intel.com/
patch subject: [PATCH 01/19] sched/fair: Add infrastructure for cache-aware load balancing

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_task: 100%
	mode: thread
	test: mmap1
	cpufreq_governor: performance
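
For readers unfamiliar with the workload: the call traces in the profile further down are dominated by __mmap and __munmap, so each thread spends its time mapping and unmapping memory in one shared address space. A rough Python analogue of such a loop is sketched below. This is not the will-it-scale mmap1 source; MAP_SIZE, worker() and run() are hypothetical names and values chosen for illustration, and Python's interpreter lock means the sketch will not reproduce the lock contention measured in this report.

```python
import mmap
import threading

MAP_SIZE = 1 << 20  # 1 MiB; arbitrary value for this sketch, not taken from the benchmark


def worker(iterations, counts, idx):
    """Map and unmap an anonymous region repeatedly, counting completed operations."""
    for _ in range(iterations):
        region = mmap.mmap(-1, MAP_SIZE)  # fileno -1 requests an anonymous mapping
        region.close()                    # closing the object unmaps the region
        counts[idx] += 1


def run(nr_threads, iterations):
    """Run the loop in nr_threads threads of one process and return per-thread counts."""
    counts = [0] * nr_threads
    threads = [
        threading.Thread(target=worker, args=(iterations, counts, i))
        for i in range(nr_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    per_thread = run(nr_threads=4, iterations=1000)
    print("per-thread ops:", per_thread, "total:", sum(per_thread))
```

Because all threads share one address space, every map and unmap has to take the same per-process lock for writing, which is consistent with down_write_killable and osq_lock dominating the profile.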



Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251023/202510231459.ad690ecd-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-icl-2sp7/mmap1/will-it-scale

commit: 
  45b7f78073 ("sched: Fix some typos in include/linux/preempt.h")
  ddf7df9467 ("sched/fair: Add infrastructure for cache-aware load balancing")

45b7f780739a3145 ddf7df94672b42db9a86b3225cf 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     12.30 ±  4%      +1.2       13.54 ±  2%  turbostat.C6%
     16510 ±  2%     +11.0%      18333 ±  5%  perf-c2c.HITM.local
     14007 ±  2%     +11.9%      15670 ±  5%  perf-c2c.HITM.remote
     30518 ±  2%     +11.4%      34003 ±  5%  perf-c2c.HITM.total
      4216 ±117%     -82.9%     720.83 ± 87%  sched_debug.cfs_rq:/.load_avg.max
    230030 ±  6%     +21.9%     280349 ±  3%  sched_debug.cpu.nr_switches.min
  32696287          +100.0%   65398751        sched_debug.sysctl_sched.sysctl_sched_features
     71075            +5.1%      74710        will-it-scale.64.threads
      1109            +5.1%       1166        will-it-scale.per_thread_ops
     71075            +5.1%      74710        will-it-scale.workload
  20012272            +1.8%   20374679        perf-stat.i.branch-misses
  20662368            +4.4%   21568525        perf-stat.i.cache-references
    181255 ±  6%     +10.5%     200376 ±  2%  perf-stat.i.context-switches
    177.39            +5.8%     187.65        perf-stat.i.cpu-migrations
     22226 ±  3%      -5.9%      20924 ±  2%  perf-stat.i.cycles-between-cache-misses
      2.83 ±  6%     +10.5%       3.13 ±  2%  perf-stat.i.metric.K/sec
      0.26            +0.0        0.27        perf-stat.overall.branch-miss-rate%
 1.609e+08            -5.6%  1.518e+08        perf-stat.overall.path-length
  19825311            +1.9%   20209803        perf-stat.ps.branch-misses
  20712472            +4.4%   21614327        perf-stat.ps.cache-references
    180352 ±  6%     +10.5%     199276 ±  2%  perf-stat.ps.context-switches
    177.05            +5.8%     187.32        perf-stat.ps.cpu-migrations
     47.47            -0.1       47.36        perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap
     48.49            -0.1       48.40        perf-profile.calltrace.cycles-pp.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     48.39            -0.1       48.30        perf-profile.calltrace.cycles-pp.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
     47.36            -0.1       47.27        perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64
     48.44            -0.1       48.35        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64
     48.34            -0.1       48.25        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.70            -0.1       49.62        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     49.70            -0.1       49.62        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     49.71            -0.1       49.64        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     49.73            -0.1       49.66        perf-profile.calltrace.cycles-pp.__munmap
     49.71            -0.1       49.64        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
     49.11            -0.1       49.06        perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
      0.52            +0.0        0.54        perf-profile.calltrace.cycles-pp.vms_gather_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
      0.54 ±  2%      +0.0        0.56        perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
      0.67 ±  3%      +0.1        0.74 ±  2%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      0.68 ±  3%      +0.1        0.74 ±  2%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      0.71 ±  3%      +0.1        0.78 ±  2%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      0.99 ±  3%      +0.1        1.10 ±  2%  perf-profile.calltrace.cycles-pp.common_startup_64
      0.97 ±  3%      +0.1        1.08 ±  2%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      0.97 ±  3%      +0.1        1.08 ±  2%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
      0.97 ±  3%      +0.1        1.08 ±  2%  perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
     94.85            -0.2       94.65        perf-profile.children.cycles-pp.osq_lock
     96.81            -0.2       96.64        perf-profile.children.cycles-pp.rwsem_down_write_slowpath
     96.87            -0.2       96.70        perf-profile.children.cycles-pp.down_write_killable
     98.91            -0.1       98.80        perf-profile.children.cycles-pp.do_syscall_64
     98.91            -0.1       98.80        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     49.70            -0.1       49.62        perf-profile.children.cycles-pp.__x64_sys_munmap
     49.70            -0.1       49.62        perf-profile.children.cycles-pp.__vm_munmap
     49.73            -0.1       49.66        perf-profile.children.cycles-pp.__munmap
      0.05            +0.0        0.06        perf-profile.children.cycles-pp.wake_q_add
      0.23            +0.0        0.24        perf-profile.children.cycles-pp.kmem_cache_free
      0.15 ±  2%      +0.0        0.16        perf-profile.children.cycles-pp.anon_vma_clone
      0.07            +0.0        0.08 ±  4%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.07 ±  5%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.08 ±  4%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.__pick_next_task
      0.07            +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
      0.22            +0.0        0.24 ±  3%  perf-profile.children.cycles-pp.vma_expand
      0.09 ±  4%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.10 ±  3%      +0.0        0.12 ±  3%  perf-profile.children.cycles-pp.schedule_idle
      0.08 ±  4%      +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.dequeue_entity
      0.09 ±  4%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.try_to_block_task
      0.45            +0.0        0.47        perf-profile.children.cycles-pp.__split_vma
      0.08 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.dequeue_entities
      0.16 ±  2%      +0.0        0.18 ±  4%  perf-profile.children.cycles-pp.commit_merge
      0.16            +0.0        0.18 ±  5%  perf-profile.children.cycles-pp.vma_complete
      0.12 ±  7%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.25            +0.0        0.27 ±  3%  perf-profile.children.cycles-pp.vma_merge_new_range
      0.54            +0.0        0.56        perf-profile.children.cycles-pp.do_mmap
      0.08 ±  5%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.05 ±  7%      +0.0        0.08 ±  4%  perf-profile.children.cycles-pp.update_curr
      0.04 ± 44%      +0.0        0.08 ±  6%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.41            +0.0        0.44 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.39            +0.0        0.43 ±  2%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.18 ±  4%      +0.0        0.21 ±  3%  perf-profile.children.cycles-pp.schedule
      0.17 ±  3%      +0.0        0.21 ±  2%  perf-profile.children.cycles-pp.schedule_preempt_disabled
      0.18 ±  2%      +0.0        0.22 ±  2%  perf-profile.children.cycles-pp.try_to_wake_up
      0.18 ±  4%      +0.0        0.22 ±  3%  perf-profile.children.cycles-pp.wake_up_q
      0.35            +0.0        0.39 ±  2%  perf-profile.children.cycles-pp.update_process_times
      0.24 ±  3%      +0.0        0.28 ±  3%  perf-profile.children.cycles-pp.sched_tick
      0.25 ±  3%      +0.0        0.29 ±  3%  perf-profile.children.cycles-pp.rwsem_wake
      0.07            +0.0        0.12 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      0.28 ±  2%      +0.0        0.33 ±  2%  perf-profile.children.cycles-pp.__schedule
      0.18 ±  3%      +0.1        0.23 ±  5%  perf-profile.children.cycles-pp.task_tick_fair
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.update_se
      0.69 ±  3%      +0.1        0.76 ±  2%  perf-profile.children.cycles-pp.cpuidle_enter
      0.69 ±  3%      +0.1        0.76 ±  2%  perf-profile.children.cycles-pp.cpuidle_enter_state
      0.72 ±  3%      +0.1        0.79 ±  2%  perf-profile.children.cycles-pp.cpuidle_idle_call
      0.36 ±  3%      +0.1        0.44 ±  2%  perf-profile.children.cycles-pp.intel_idle_irq
      0.99 ±  3%      +0.1        1.10 ±  2%  perf-profile.children.cycles-pp.common_startup_64
      0.99 ±  3%      +0.1        1.10 ±  2%  perf-profile.children.cycles-pp.cpu_startup_entry
      0.99 ±  3%      +0.1        1.10 ±  2%  perf-profile.children.cycles-pp.do_idle
      0.97 ±  3%      +0.1        1.08 ±  2%  perf-profile.children.cycles-pp.start_secondary
     94.33            -0.2       94.10        perf-profile.self.cycles-pp.osq_lock
      0.32            -0.0        0.29 ±  2%  perf-profile.self.cycles-pp.rwsem_down_write_slowpath
      0.07 ±  5%      -0.0        0.05        perf-profile.self.cycles-pp.vms_complete_munmap_vmas
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.wake_q_add
      0.06            +0.0        0.07 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock
      0.06 ±  6%      +0.0        0.08 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.04 ± 44%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.34 ±  4%      +0.1        0.42        perf-profile.self.cycles-pp.intel_idle_irq
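
As a reading aid for the table above, the middle column can be recomputed from the two outer value columns. The sketch below assumes, based on the figures shown, that rows with a trailing "%" in the change column are relative changes and that rows without it (turbostat.C6% and the perf-profile entries) are absolute differences in percentage points. The helper names pct_change and point_change are illustrative and are not part of the lkp tooling.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


# (base, patched) values copied from rows of the table above.
relative_rows = {
    "will-it-scale.workload": (71075, 74710),
    "will-it-scale.per_thread_ops": (1109, 1166),
    "perf-c2c.HITM.total": (30518, 34003),
    "sched_debug.cfs_rq:/.load_avg.max": (4216, 720.83),
}
point_rows = {
    "turbostat.C6%": (12.30, 13.54),
    "perf-profile.children.cycles-pp.osq_lock": (94.85, 94.65),
}

for name, (base, new) in relative_rows.items():
    print(f"{pct_change(base, new):+7.1f}%  {name}")
for name, (base, new) in point_rows.items():
    print(f"{point_change(base, new):+7.1f}   {name}")
```

Rows whose inputs are printed in rounded form, such as perf-stat.overall.path-length, may differ from the reported change in the last digit when recomputed this way.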




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

