Message-ID: <202502251026.bb927780-lkp@intel.com>
Date: Tue, 25 Feb 2025 10:32:13 +0800
From: kernel test robot <oliver.sang@...el.com>
To: zihan zhou <15645113830zzh@...il.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Vincent Guittot
	<vincent.guittot@...aro.org>, <aubrey.li@...ux.intel.com>,
	<yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [tip:sched/core] [sched]  2ae891b826:  hackbench.throughput 6.2%
 regression



Hello,

kernel test robot noticed a 6.2% regression of hackbench.throughput on:


commit: 2ae891b826958b60919ea21c727f77bcd6ffcc2c ("sched: Reduce the default slice to avoid tasks getting an extra tick")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core

[test failed on linux-next/master d4b0fd87ff0d4338b259dc79b2b3c6f7e70e8afa]

testcase: hackbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	iterations: 4
	mode: process
	ipc: socket
	cpufreq_governor: performance


In addition, the commit also has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.membarrier.ops_per_sec  10.5% regression                             |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | nr_threads=100%                                                                           |
|                  | test=membarrier                                                                           |
|                  | testtime=60s                                                                              |
+------------------+-------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202502251026.bb927780-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250225/202502251026.bb927780-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
  gcc-12/performance/socket/4/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench

commit: 
  f553741ac8 ("sched: Cancel the slice protection of the idle entity")
  2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")

f553741ac8c0e467 2ae891b826958b60919ea21c727 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      5457 ±  6%     +30.9%       7146 ± 11%  perf-c2c.DRAM.remote
      1156 ± 17%     +76.3%       2038 ± 19%  perf-c2c.HITM.remote
    790654 ±  2%     +22.8%     971104        sched_debug.cpu.nr_switches.avg
    659209 ±  2%     +24.6%     821703 ±  3%  sched_debug.cpu.nr_switches.min
   1706905           +20.0%    2047861        vmstat.system.cs
    296017            +5.8%     313318 ±  2%  vmstat.system.in
     15076 ± 48%    +121.3%      33360 ± 35%  proc-vmstat.numa_pages_migrated
   3389933 ±  5%     +15.3%    3907919 ±  3%  proc-vmstat.pgalloc_normal
   2565152 ±  6%     +27.9%    3280218 ±  5%  proc-vmstat.pgfree
     15076 ± 48%    +121.3%      33360 ± 35%  proc-vmstat.pgmigrate_success
    781.28 ± 57%    -100.0%       0.08 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
      3394 ± 51%    -100.0%       0.08 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
      0.18 ± 74%   +3280.0%       6.22 ±125%  perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     42.40 ± 41%     -62.7%      15.83 ± 60%  perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     86.80 ± 42%     -89.4%       9.17 ± 97%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.unlink_anon_vmas
    977.49 ± 51%     -99.9%       0.95 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
      3397 ± 50%    -100.0%       0.95 ±223%  perf-sched.wait_time.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
    433157            -6.2%     406447        hackbench.throughput
    423258            -6.9%     394005        hackbench.throughput_avg
    433157            -6.2%     406447        hackbench.throughput_best
    411374            -6.8%     383238        hackbench.throughput_worst
    143.13            +7.3%     153.65        hackbench.time.elapsed_time
    143.13            +7.3%     153.65        hackbench.time.elapsed_time.max
  39754543 ±  3%     +56.8%   62349308        hackbench.time.involuntary_context_switches
    623881            +3.9%     648284        hackbench.time.minor_page_faults
     17045            +7.7%      18350        hackbench.time.system_time
    900.50            +2.5%     922.71        hackbench.time.user_time
 2.019e+08           +23.3%  2.489e+08        hackbench.time.voluntary_context_switches
      1.61            -2.3%       1.57        perf-stat.i.MPKI
 4.411e+10            -5.0%  4.192e+10        perf-stat.i.branch-instructions
      0.41 ±  2%      +0.0        0.44        perf-stat.i.branch-miss-rate%
 1.744e+08            +1.6%  1.772e+08        perf-stat.i.branch-misses
     25.15            -0.6       24.50        perf-stat.i.cache-miss-rate%
   3.5e+08            -7.0%  3.255e+08        perf-stat.i.cache-misses
 1.398e+09            -3.8%  1.346e+09        perf-stat.i.cache-references
   1677956 ±  2%     +20.8%    2027400        perf-stat.i.context-switches
      1.49            +5.6%       1.57        perf-stat.i.cpi
     46084 ±  8%     +44.6%      66621 ±  8%  perf-stat.i.cpu-migrations
    935.91            +8.3%       1013        perf-stat.i.cycles-between-cache-misses
 2.175e+11            -5.1%  2.065e+11        perf-stat.i.instructions
      0.68            -5.2%       0.64        perf-stat.i.ipc
     13.38 ±  2%     +21.7%      16.28        perf-stat.i.metric.K/sec
      1.61            -2.0%       1.58        perf-stat.overall.MPKI
      0.39            +0.0        0.42        perf-stat.overall.branch-miss-rate%
     25.05            -0.8       24.23        perf-stat.overall.cache-miss-rate%
      1.49            +5.5%       1.57        perf-stat.overall.cpi
    926.46            +7.6%     996.92        perf-stat.overall.cycles-between-cache-misses
      0.67            -5.2%       0.64        perf-stat.overall.ipc
 4.382e+10            -5.0%  4.164e+10        perf-stat.ps.branch-instructions
  1.73e+08            +1.5%  1.755e+08        perf-stat.ps.branch-misses
 3.475e+08            -7.0%  3.233e+08        perf-stat.ps.cache-misses
 1.387e+09            -3.8%  1.334e+09        perf-stat.ps.cache-references
   1662988 ±  2%     +20.6%    2004942        perf-stat.ps.context-switches
     44600 ±  8%     +43.7%      64072 ±  7%  perf-stat.ps.cpu-migrations
 2.161e+11            -5.1%  2.051e+11        perf-stat.ps.instructions
 3.105e+13            +2.0%  3.169e+13        perf-stat.total.instructions
      8.54 ±  2%      -1.0        7.54        perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
      8.46 ±  2%      -1.0        7.47        perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
      8.30 ±  2%      -1.0        7.31        perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg
      4.38 ±  2%      -0.6        3.81 ±  2%  perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
      3.20 ±  3%      -0.3        2.85        perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
      3.00 ±  3%      -0.3        2.67        perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
      3.40 ±  3%      -0.3        3.10 ±  3%  perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
      2.30 ±  3%      -0.3        2.00        perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
      3.25 ±  3%      -0.3        2.97 ±  3%  perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
      3.07 ±  2%      -0.3        2.79 ±  2%  perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
      3.05 ±  3%      -0.3        2.79 ±  3%  perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic
      2.50 ±  3%      -0.2        2.29 ±  2%  perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.__kmalloc_node_track_caller_noprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
      2.18 ±  3%      -0.2        1.99        perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_noprof.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
      1.99 ±  3%      -0.2        1.82        perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
      1.95 ±  4%      -0.2        1.78 ±  2%  perf-profile.calltrace.cycles-pp.clear_bhb_loop.read
      2.68 ±  3%      -0.1        2.54 ±  2%  perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write
      1.55 ±  3%      -0.1        1.42        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
      1.55 ±  3%      -0.1        1.42        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
      1.35 ±  3%      -0.1        1.24 ±  3%  perf-profile.calltrace.cycles-pp.__slab_free.kfree.skb_release_data.consume_skb.unix_stream_read_generic
      1.04 ±  3%      -0.1        0.96 ±  3%  perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
      1.12 ±  3%      -0.1        1.04        perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write
      0.62 ±  4%      -0.1        0.56        perf-profile.calltrace.cycles-pp.__build_skb_around.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
      0.72 ±  3%      -0.1        0.66        perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg
      0.63 ±  2%      -0.1        0.57 ±  3%  perf-profile.calltrace.cycles-pp.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
      0.57 ±  3%      -0.0        0.52 ±  2%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter
      1.17 ±  3%      +0.2        1.32 ±  6%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.42 ± 50%      +0.3        0.76 ± 22%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
      1.36 ±  3%      +0.5        1.88 ± 21%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
      1.38 ±  3%      +0.5        1.91 ± 21%  perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
      1.43 ±  3%      +0.5        1.98 ± 21%  perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
      1.63 ±  3%      +0.7        2.28 ± 21%  perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
     36.49            +0.8       37.34        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     35.51            +0.9       36.43        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     38.59            +0.9       39.52        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     38.32            +1.0       39.27        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     34.44            +1.0       35.42        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     33.04            +1.1       34.12        perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
      8.58 ±  2%      -1.0        7.58        perf-profile.children.cycles-pp.unix_stream_read_actor
      8.35 ±  2%      -1.0        7.36        perf-profile.children.cycles-pp.__skb_datagram_iter
      8.50 ±  2%      -1.0        7.51        perf-profile.children.cycles-pp.skb_copy_datagram_iter
      4.40 ±  2%      -0.6        3.83 ±  2%  perf-profile.children.cycles-pp._copy_to_iter
      5.77 ±  2%      -0.4        5.32 ±  3%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
      4.41 ±  3%      -0.4        3.98        perf-profile.children.cycles-pp.__check_object_size
      4.80 ±  3%      -0.4        4.40        perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      3.24 ±  3%      -0.4        2.89        perf-profile.children.cycles-pp.simple_copy_to_iter
      2.98 ±  3%      -0.3        2.64        perf-profile.children.cycles-pp.check_heap_object
      3.98 ±  3%      -0.3        3.64        perf-profile.children.cycles-pp.clear_bhb_loop
      3.44 ±  2%      -0.3        3.14 ±  3%  perf-profile.children.cycles-pp.skb_release_head_state
      3.31 ±  2%      -0.3        3.03 ±  3%  perf-profile.children.cycles-pp.unix_destruct_scm
      3.09 ±  3%      -0.3        2.82 ±  3%  perf-profile.children.cycles-pp.sock_wfree
      2.42 ±  3%      -0.2        2.23 ±  3%  perf-profile.children.cycles-pp.__slab_free
      2.59 ±  2%      -0.2        2.42 ±  2%  perf-profile.children.cycles-pp.mod_objcg_state
      1.78 ±  3%      -0.2        1.62        perf-profile.children.cycles-pp.entry_SYSCALL_64
      2.76 ±  3%      -0.1        2.61 ±  2%  perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
      1.38 ±  3%      -0.1        1.25        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.30 ±  4%      -0.1        1.19        perf-profile.children.cycles-pp.obj_cgroup_charge
      0.65 ±  4%      -0.1        0.57        perf-profile.children.cycles-pp.__build_skb_around
      0.66 ±  3%      -0.1        0.61        perf-profile.children.cycles-pp.refill_obj_stock
      0.73 ±  3%      -0.1        0.68        perf-profile.children.cycles-pp.__check_heap_object
      0.59 ±  3%      -0.1        0.54 ±  2%  perf-profile.children.cycles-pp.rw_verify_area
      0.66 ±  2%      -0.1        0.61 ±  3%  perf-profile.children.cycles-pp.skb_unlink
      0.55 ±  4%      -0.0        0.51 ±  2%  perf-profile.children.cycles-pp.__virt_addr_valid
      0.28 ±  3%      -0.0        0.26        perf-profile.children.cycles-pp.__scm_recv_common
      0.16 ±  4%      -0.0        0.14 ±  3%  perf-profile.children.cycles-pp.is_vmalloc_addr
      0.16 ±  3%      -0.0        0.14 ±  2%  perf-profile.children.cycles-pp.security_socket_recvmsg
      0.17 ±  2%      -0.0        0.16        perf-profile.children.cycles-pp.put_pid
      0.14 ±  3%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.manage_oob
      0.11            -0.0        0.10        perf-profile.children.cycles-pp.wait_for_unix_gc
      0.06 ±  6%      +0.0        0.08 ± 11%  perf-profile.children.cycles-pp.os_xsave
      0.20 ±  3%      +0.0        0.23 ±  7%  perf-profile.children.cycles-pp.__get_user_8
      0.06 ±  6%      +0.0        0.09 ± 17%  perf-profile.children.cycles-pp.sched_clock
      0.06 ±  6%      +0.0        0.09 ± 14%  perf-profile.children.cycles-pp.check_preempt_wakeup_fair
      0.09 ±  5%      +0.0        0.13 ± 18%  perf-profile.children.cycles-pp.__switch_to
      0.08 ±  4%      +0.0        0.12 ± 21%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.15 ±  6%      +0.0        0.20 ± 10%  perf-profile.children.cycles-pp.__dequeue_entity
      0.25 ±  3%      +0.0        0.29 ±  9%  perf-profile.children.cycles-pp.rseq_ip_fixup
      0.09 ± 10%      +0.0        0.14 ± 15%  perf-profile.children.cycles-pp.pick_eevdf
      0.13 ±  7%      +0.0        0.18 ± 14%  perf-profile.children.cycles-pp.__enqueue_entity
      0.08 ± 10%      +0.0        0.12 ± 27%  perf-profile.children.cycles-pp.wakeup_preempt
      0.01 ±200%      +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.vruntime_eligible
      0.01 ±200%      +0.1        0.07 ± 23%  perf-profile.children.cycles-pp.___perf_sw_event
      0.01 ±200%      +0.1        0.08 ± 27%  perf-profile.children.cycles-pp.put_prev_entity
      0.31 ±  2%      +0.1        0.38 ± 12%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      0.22 ±  7%      +0.1        0.30 ± 12%  perf-profile.children.cycles-pp.set_next_entity
      0.14 ±  5%      +0.1        0.22 ± 22%  perf-profile.children.cycles-pp.pick_task_fair
      0.14 ± 44%      +0.1        0.24 ± 15%  perf-profile.children.cycles-pp.get_any_partial
      0.27 ±  5%      +0.1        0.37 ± 15%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.33 ±  4%      +0.1        0.47 ± 22%  perf-profile.children.cycles-pp.enqueue_entity
      0.30 ±  4%      +0.2        0.46 ± 26%  perf-profile.children.cycles-pp.update_load_avg
      0.48 ±  4%      +0.2        0.72 ± 26%  perf-profile.children.cycles-pp.enqueue_task_fair
      0.51 ±  3%      +0.2        0.75 ± 27%  perf-profile.children.cycles-pp.enqueue_task
      0.48 ±  6%      +0.3        0.75 ± 23%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.49 ±  6%      +0.3        0.76 ± 24%  perf-profile.children.cycles-pp.__pick_next_task
      0.60 ±  4%      +0.3        0.89 ± 25%  perf-profile.children.cycles-pp.ttwu_do_activate
      1.67 ±  2%      +0.6        2.23 ± 20%  perf-profile.children.cycles-pp.schedule_timeout
      1.64 ±  3%      +0.7        2.29 ± 21%  perf-profile.children.cycles-pp.unix_stream_data_wait
      1.78 ±  4%      +0.7        2.53 ± 21%  perf-profile.children.cycles-pp.schedule
      1.78 ±  4%      +0.8        2.54 ± 22%  perf-profile.children.cycles-pp.__schedule
     36.58            +0.8       37.42        perf-profile.children.cycles-pp.ksys_write
     35.60            +0.9       36.51        perf-profile.children.cycles-pp.vfs_write
     34.52            +1.0       35.49        perf-profile.children.cycles-pp.sock_write_iter
     33.31            +1.0       34.36        perf-profile.children.cycles-pp.unix_stream_sendmsg
      4.37 ±  2%      -0.6        3.79 ±  2%  perf-profile.self.cycles-pp._copy_to_iter
      3.94 ±  3%      -0.3        3.60        perf-profile.self.cycles-pp.clear_bhb_loop
      2.27 ±  3%      -0.3        1.98        perf-profile.self.cycles-pp.check_heap_object
      3.29 ±  2%      -0.3        3.01 ±  5%  perf-profile.self.cycles-pp.__memcg_slab_free_hook
      2.03 ±  4%      -0.3        1.76 ±  2%  perf-profile.self.cycles-pp.kmem_cache_free
      2.50 ±  3%      -0.2        2.25 ±  3%  perf-profile.self.cycles-pp.sock_wfree
      2.61 ±  2%      -0.2        2.37 ±  3%  perf-profile.self.cycles-pp.unix_stream_read_generic
      2.30 ±  3%      -0.2        2.09        perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
      2.37 ±  3%      -0.2        2.18 ±  3%  perf-profile.self.cycles-pp.__slab_free
      1.04 ±  4%      -0.2        0.86 ±  5%  perf-profile.self.cycles-pp.skb_release_data
      2.19 ±  4%      -0.2        2.01 ±  2%  perf-profile.self.cycles-pp.mod_objcg_state
      1.31 ±  3%      -0.1        1.18        perf-profile.self.cycles-pp.__kmalloc_node_track_caller_noprof
      1.33 ±  3%      -0.1        1.21        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.04 ±  3%      -0.1        0.93        perf-profile.self.cycles-pp.kmem_cache_alloc_node_noprof
      1.13 ±  3%      -0.1        1.02        perf-profile.self.cycles-pp.__alloc_skb
      1.38 ±  2%      -0.1        1.29 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.74 ±  3%      -0.1        0.66        perf-profile.self.cycles-pp.__skb_datagram_iter
      1.11 ±  3%      -0.1        1.03 ±  2%  perf-profile.self.cycles-pp.sock_write_iter
      0.80 ±  3%      -0.1        0.74 ±  2%  perf-profile.self.cycles-pp.write
      0.60 ±  4%      -0.1        0.54        perf-profile.self.cycles-pp.__build_skb_around
      0.84 ±  4%      -0.1        0.78        perf-profile.self.cycles-pp.sock_read_iter
      0.69 ±  3%      -0.1        0.64        perf-profile.self.cycles-pp.__check_heap_object
      0.62 ±  3%      -0.1        0.57        perf-profile.self.cycles-pp.refill_obj_stock
      0.82            -0.0        0.77        perf-profile.self.cycles-pp.read
      0.80 ±  3%      -0.0        0.75 ±  3%  perf-profile.self.cycles-pp.do_syscall_64
      0.51 ±  4%      -0.0        0.47        perf-profile.self.cycles-pp.__virt_addr_valid
      0.46 ±  2%      -0.0        0.43 ±  2%  perf-profile.self.cycles-pp.kfree
      0.59 ±  3%      -0.0        0.56 ±  2%  perf-profile.self.cycles-pp.__check_object_size
      0.44 ±  2%      -0.0        0.41        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.36 ±  3%      -0.0        0.32 ±  2%  perf-profile.self.cycles-pp.rw_verify_area
      0.43 ±  2%      -0.0        0.40 ±  2%  perf-profile.self.cycles-pp.unix_write_space
      0.37 ±  4%      -0.0        0.34 ±  2%  perf-profile.self.cycles-pp.x64_sys_call
      0.34 ±  3%      -0.0        0.31 ±  2%  perf-profile.self.cycles-pp.__cond_resched
      0.29 ±  3%      -0.0        0.27        perf-profile.self.cycles-pp.ksys_write
      0.30 ±  2%      -0.0        0.28 ±  2%  perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
      0.18 ±  4%      -0.0        0.16 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.18 ±  2%      -0.0        0.17 ±  2%  perf-profile.self.cycles-pp.unix_destruct_scm
      0.21 ±  3%      -0.0        0.19        perf-profile.self.cycles-pp.__scm_recv_common
      0.25            -0.0        0.23 ±  2%  perf-profile.self.cycles-pp.kmalloc_reserve
      0.15 ±  2%      -0.0        0.14        perf-profile.self.cycles-pp.skb_unlink
      0.15 ±  2%      -0.0        0.14        perf-profile.self.cycles-pp.unix_scm_to_skb
      0.07 ±  9%      +0.0        0.10 ± 19%  perf-profile.self.cycles-pp.pick_eevdf
      0.09 ±  5%      +0.0        0.13 ± 16%  perf-profile.self.cycles-pp.__switch_to
      0.11 ±  7%      +0.0        0.14 ± 10%  perf-profile.self.cycles-pp.__dequeue_entity
      0.08 ±  5%      +0.0        0.12 ± 22%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.02 ±122%      +0.0        0.06 ± 17%  perf-profile.self.cycles-pp.native_sched_clock
      0.13 ±  8%      +0.1        0.18 ± 14%  perf-profile.self.cycles-pp.__enqueue_entity
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.vruntime_eligible
      0.27 ±  5%      +0.1        0.37 ± 15%  perf-profile.self.cycles-pp.switch_mm_irqs_off
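
For readers who want to cross-check the %change column against the two value
columns, here is a minimal Python sketch. It is not part of the LKP tooling:
the names ROW, parse_row and percent_change are hypothetical, and the row
layout (base value, optional "± N%" stddev, change, new value, optional
stddev, metric name) is assumed from the rows shown above. Rows whose change
has no percent sign (for example the perf-profile and *-rate% rows) appear to
report an absolute difference rather than a relative one, so the relative
formula below is only meaningful for rows whose change ends in "%".

```python
import re

# Hypothetical pattern for one comparison row, assumed from the table above:
#   <base> [± N%]  <change>  <new> [± N%]  <metric>
ROW = re.compile(
    r"^\s*(?P<base>-?[\d.]+(?:e[+-]?\d+)?)"   # base value, e.g. 433157 or 2.019e+08
    r"(?:\s*±\s*\d+%)?"                        # optional stddev of the base value
    r"\s+(?P<change>[+-][\d.]+%?)"             # change column, e.g. -6.2% or +0.1
    r"\s+(?P<new>-?[\d.]+(?:e[+-]?\d+)?)"      # value on the tested commit
    r"(?:\s*±\s*\d+%)?"                        # optional stddev of the new value
    r"\s+(?P<metric>\S.*)$"                    # metric name
)


def parse_row(line):
    """Return a dict for a comparison row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m.group("base")),
        "change": m.group("change"),
        "new": float(m.group("new")),
        "metric": m.group("metric").strip(),
    }


def percent_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# Three rows copied from the hackbench table above.
rows = [
    "    433157            -6.2%     406447        hackbench.throughput",
    "  39754543 ±  3%     +56.8%   62349308        hackbench.time.involuntary_context_switches",
    "   1706905           +20.0%    2047861        vmstat.system.cs",
]

for line in rows:
    r = parse_row(line)
    computed = percent_change(r["base"], r["new"])
    print(f"{r['metric']}: computed {computed:+.1f}% (reported {r['change']})")
```

For the three rows used here, the recomputed values round to the reported
-6.2%, +56.8% and +20.0%.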


***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/membarrier/stress-ng/60s

commit: 
  f553741ac8 ("sched: Cancel the slice protection of the idle entity")
  2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")

f553741ac8c0e467 2ae891b826958b60919ea21c727 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.08            -0.1        0.99        mpstat.cpu.all.irq%
     67.18 ±  2%     -11.9%      59.20 ±  5%  mpstat.max_utilization_pct
      3401 ± 19%     -31.4%       2332 ± 18%  perf-c2c.DRAM.remote
      2396 ±  3%     -23.1%       1844 ± 18%  perf-c2c.HITM.remote
     29248           +14.3%      33418        vmstat.system.cs
    788485            -9.1%     716631        vmstat.system.in
    191106            -1.7%     187946        proc-vmstat.nr_anon_pages
    535277 ±  2%      +5.6%     565009 ±  4%  proc-vmstat.numa_hit
    469052 ±  2%      +6.3%     498763 ±  5%  proc-vmstat.numa_local
     51285 ±  7%     +54.3%      79119 ± 31%  proc-vmstat.numa_pages_migrated
     51285 ±  7%     +54.3%      79119 ± 31%  proc-vmstat.pgmigrate_success
     16417 ±  7%    +131.4%      37986 ± 78%  proc-vmstat.pgreuse
    505.28           -10.6%     451.92        stress-ng.membarrier.membarrier_calls_per_sec
     97160           -10.5%      86939        stress-ng.membarrier.ops
      1618           -10.5%       1448        stress-ng.membarrier.ops_per_sec
     55094 ±  5%    +277.5%     207976 ±  9%  stress-ng.time.involuntary_context_switches
      3195 ±  2%      -8.3%       2931        stress-ng.time.percent_of_cpu_this_job_got
      1921 ±  2%      -8.3%       1761        stress-ng.time.system_time
   1047923            +5.9%    1109900        stress-ng.time.voluntary_context_switches
 5.501e+09 ±  2%      -7.8%  5.074e+09        perf-stat.i.branch-instructions
     30090           +14.4%      34431        perf-stat.i.context-switches
 1.041e+11 ±  2%      -7.6%  9.627e+10        perf-stat.i.cpu-cycles
     10683            +6.7%      11402        perf-stat.i.cpu-migrations
  2.73e+10 ±  2%      -7.6%  2.522e+10        perf-stat.i.instructions
 5.406e+09 ±  2%      -7.8%  4.985e+09        perf-stat.ps.branch-instructions
     29571           +14.4%      33836        perf-stat.ps.context-switches
 1.024e+11 ±  2%      -7.6%   9.46e+10        perf-stat.ps.cpu-cycles
     10498            +6.7%      11203        perf-stat.ps.cpu-migrations
 2.683e+10 ±  2%      -7.6%  2.478e+10        perf-stat.ps.instructions
 1.631e+12 ±  2%      -7.7%  1.505e+12        perf-stat.total.instructions
    698086 ±  4%     -12.0%     614339 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.avg
    918198 ±  7%     -13.5%     794083 ±  6%  sched_debug.cfs_rq:/.avg_vruntime.max
    650282 ±  4%     -12.9%     566525 ±  4%  sched_debug.cfs_rq:/.avg_vruntime.min
    698086 ±  4%     -12.0%     614339 ±  3%  sched_debug.cfs_rq:/.min_vruntime.avg
    918198 ±  7%     -13.5%     794083 ±  6%  sched_debug.cfs_rq:/.min_vruntime.max
    650282 ±  4%     -12.9%     566525 ±  4%  sched_debug.cfs_rq:/.min_vruntime.min
     13.48 ± 36%    +250.6%      47.25 ± 40%  sched_debug.cfs_rq:/.removed.load_avg.avg
     77.26 ± 17%     +91.9%     148.27 ± 24%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      5.08 ± 33%    +246.5%      17.60 ± 35%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
    212.33 ± 20%     +30.1%     276.17 ±  7%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     30.44 ± 21%     +89.0%      57.52 ± 14%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
      5.08 ± 33%    +246.6%      17.60 ± 35%  sched_debug.cfs_rq:/.removed.util_avg.avg
    212.25 ± 21%     +30.1%     276.08 ±  7%  sched_debug.cfs_rq:/.removed.util_avg.max
     30.43 ± 21%     +89.0%      57.51 ± 14%  sched_debug.cfs_rq:/.removed.util_avg.stddev
     15701           +12.8%      17719        sched_debug.cpu.nr_switches.avg
     11778 ±  7%     +20.3%      14165 ±  8%  sched_debug.cpu.nr_switches.min
   -202.17           +21.0%    -244.58        sched_debug.cpu.nr_uninterruptible.min
      1.43 ± 36%     -99.6%       0.01 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      0.94 ± 23%     -91.9%       0.08 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      1.60 ± 68%     -99.9%       0.00 ±223%  perf-sched.sch_delay.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
      1.39 ±  8%     +71.7%       2.38 ±  7%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
      1.95 ±  5%     +23.7%       2.41 ±  5%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
      0.89 ±  4%     -16.0%       0.75 ±  3%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      0.01 ± 25%     +75.0%       0.02 ± 34%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.06 ± 11%     -37.5%       0.04 ± 40%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.80 ±145%    +478.8%       4.62 ± 52%  perf-sched.sch_delay.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
      5.29 ± 41%     -99.9%       0.01 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      6.37 ± 13%     -93.7%       0.40 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      2.22 ± 49%     -99.9%       0.00 ±223%  perf-sched.sch_delay.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
     10.40 ± 13%     +32.1%      13.74 ±  5%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
      4.55 ±  5%     -34.9%       2.96 ± 42%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.98 ±  4%     +33.4%       1.30 ±  6%  perf-sched.total_sch_delay.average.ms
     22.34           -12.3%      19.59        perf-sched.total_wait_and_delay.average.ms
    102076           +18.6%     121096        perf-sched.total_wait_and_delay.count.ms
     21.37           -14.4%      18.29        perf-sched.total_wait_time.average.ms
    515.07 ± 36%     +63.2%     840.46 ± 16%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     11.25 ±  5%     +56.4%      17.59 ±  7%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
     15.80           -13.5%      13.67        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
    487.31 ±  4%     +16.4%     567.38 ±  2%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      8.00 ± 26%     +95.8%      15.67 ± 20%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1384 ± 12%     +58.1%       2188 ±  8%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
     10678 ±  7%    +270.1%      39521 ±  4%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
     85629           -12.4%      75039 ±  3%  perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      2443 ± 44%     -58.3%       1018        perf-sched.wait_and_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      2099 ± 55%     -76.1%     501.21        perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     15.94 ±  9%     -86.6%       2.13 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
    515.06 ± 36%     +63.2%     840.45 ± 16%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     15.25 ±  3%     -85.0%       2.29 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
    427.24 ± 78%     -99.6%       1.55 ±107%  perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
     10.38 ± 53%     -95.2%       0.50 ±223%  perf-sched.wait_time.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
     48.58 ±185%     -94.2%       2.80 ± 99%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      9.86 ±  5%     +54.2%      15.21 ±  7%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
     14.92           -13.4%      12.92        perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      1.30 ±  8%     -11.1%       1.15 ±  6%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    487.30 ±  4%     +16.4%     567.36 ±  2%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      6.13 ±141%    +268.5%      22.60 ± 17%  perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
     25.13 ±  9%     -91.5%       2.13 ±223%  perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
     25.92 ± 12%     -86.1%       3.61 ±223%  perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      2260 ± 59%     -99.9%       3.00 ±118%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
     13.02 ± 43%     -96.2%       0.50 ±223%  perf-sched.wait_time.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
      2443 ± 44%     -58.3%       1018        perf-sched.wait_time.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      2097 ± 55%     -76.1%     500.54        perf-sched.wait_time.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
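
As a reading aid for the rows above: each line gives the mean on the parent commit (optionally with ± relative stddev), the %change, the mean on 2ae891b826, and the metric name. The %change column appears to be the plain relative change (new - base) / base. Below is a minimal Python sketch of that reading; the regular expression and the helper names parse_row and relative_change are hypothetical and not part of lkp-tests.

```python
import re

# Hypothetical pattern for one comparison row (an assumption, not lkp-tests code):
#   base [± stddev%]   change%   new [± stddev%]   metric
ROW = re.compile(
    r"^\s*(?P<base>[\d.]+)(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-][\d.]+)%"
    r"\s+(?P<new>[\d.]+)(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return the fields of one table row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m["base"]),      # mean on the parent commit
        "new": float(m["new"]),        # mean on the tested commit
        "change": float(m["change"]),  # %change as printed in the report
        "metric": m["metric"],
    }


def relative_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    row = parse_row(
        "     102076           +18.6%     121096        "
        "perf-sched.total_wait_and_delay.count.ms"
    )
    print(row["metric"], round(relative_change(row["base"], row["new"]), 1))
```

Recomputed values can differ from the printed column by a few tenths of a percent on rows with small means (for example 1.39 -> 2.38 gives about +71.2% against the reported +71.7%), presumably because the displayed means are rounded.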





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

