Message-ID: <202502251026.bb927780-lkp@intel.com>
Date: Tue, 25 Feb 2025 10:32:13 +0800
From: kernel test robot <oliver.sang@...el.com>
To: zihan zhou <15645113830zzh@...il.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Vincent Guittot
<vincent.guittot@...aro.org>, <aubrey.li@...ux.intel.com>,
<yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [tip:sched/core] [sched] 2ae891b826: hackbench.throughput 6.2%
regression
Hello,
kernel test robot noticed a 6.2% regression of hackbench.throughput on:
commit: 2ae891b826958b60919ea21c727f77bcd6ffcc2c ("sched: Reduce the default slice to avoid tasks getting an extra tick")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
[test failed on linux-next/master d4b0fd87ff0d4338b259dc79b2b3c6f7e70e8afa]
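The commit under test shrinks the default scheduler slice so that, with slice expiry enforced only at scheduler-tick boundaries, a task's slice no longer rounds up into an extra tick. A minimal sketch of that rounding effect (illustrative arithmetic only, not kernel code; the microsecond values below are hypothetical):

```python
# Illustration: if slice expiry is only checked at tick boundaries, the
# effective time a task runs is its slice rounded UP to whole ticks.
import math

def effective_ticks(slice_us: int, tick_us: int) -> int:
    """Whole ticks a task runs before its slice can actually expire."""
    return math.ceil(slice_us / tick_us)

# At HZ=1000 the tick is 1000 us.  A slice just above a tick multiple
# costs a whole extra tick of runtime:
print(effective_ticks(3000, 1000))  # exact multiple -> 3 ticks
print(effective_ticks(3100, 1000))  # 100 us overshoot -> 4 ticks (extra tick)
print(effective_ticks(2800, 1000))  # trimmed slice -> back to 3 ticks
```

Trimming the default slice slightly below a tick multiple avoids the extra tick, at the cost of more frequent preemption — consistent with the higher context-switch rates in the numbers below.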
testcase: hackbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
iterations: 4
mode: process
ipc: socket
cpufreq_governor: performance
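A rough stand-alone approximation of this workload (not the exact LKP harness) is perf's bundled hackbench-style benchmark, whose defaults — multi-process senders/receivers over socketpairs — match mode=process / ipc=socket above. The group and loop counts here are hypothetical stand-ins for the harness's nr_threads=100% scaling:

```shell
#!/bin/sh
# Sketch: hackbench-style socket/process messaging load.
# 'perf bench sched messaging' defaults to processes + socketpairs.
if command -v perf >/dev/null 2>&1; then
    perf bench sched messaging -g 4 -l 200
else
    echo "perf not installed; skipping" >&2
fi
```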
In addition, the commit has a significant impact on the following test:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.membarrier.ops_per_sec 10.5% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_threads=100% |
| | test=membarrier |
| | testtime=60s |
+------------------+-------------------------------------------------------------------------------------------+
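The stress-ng row above can be approximated with a direct invocation built from the table's parameters (a sketch, not the LKP harness; the shortened timeout is for illustration — the report used testtime=60s). In stress-ng, a stressor count of 0 means one instance per online CPU, matching nr_threads=100%:

```shell
#!/bin/sh
# Sketch: membarrier stressor, one instance per CPU (count 0 = all CPUs).
# The report ran with --timeout 60s; 5s here keeps the sketch quick.
if command -v stress-ng >/dev/null 2>&1; then
    stress-ng --membarrier 0 --timeout 5s --metrics-brief
else
    echo "stress-ng not installed; skipping" >&2
fi
```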
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202502251026.bb927780-lkp@intel.com
Details are as follows:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250225/202502251026.bb927780-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
gcc-12/performance/socket/4/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
commit:
f553741ac8 ("sched: Cancel the slice protection of the idle entity")
2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")
f553741ac8c0e467 2ae891b826958b60919ea21c727
---------------- ---------------------------
%stddev %change %stddev
\ | \
5457 ± 6% +30.9% 7146 ± 11% perf-c2c.DRAM.remote
1156 ± 17% +76.3% 2038 ± 19% perf-c2c.HITM.remote
790654 ± 2% +22.8% 971104 sched_debug.cpu.nr_switches.avg
659209 ± 2% +24.6% 821703 ± 3% sched_debug.cpu.nr_switches.min
1706905 +20.0% 2047861 vmstat.system.cs
296017 +5.8% 313318 ± 2% vmstat.system.in
15076 ± 48% +121.3% 33360 ± 35% proc-vmstat.numa_pages_migrated
3389933 ± 5% +15.3% 3907919 ± 3% proc-vmstat.pgalloc_normal
2565152 ± 6% +27.9% 3280218 ± 5% proc-vmstat.pgfree
15076 ± 48% +121.3% 33360 ± 35% proc-vmstat.pgmigrate_success
781.28 ± 57% -100.0% 0.08 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
3394 ± 51% -100.0% 0.08 ±223% perf-sched.sch_delay.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
0.18 ± 74% +3280.0% 6.22 ±125% perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
42.40 ± 41% -62.7% 15.83 ± 60% perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
86.80 ± 42% -89.4% 9.17 ± 97% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.unlink_anon_vmas
977.49 ± 51% -99.9% 0.95 ±223% perf-sched.wait_time.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
3397 ± 50% -100.0% 0.95 ±223% perf-sched.wait_time.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
433157 -6.2% 406447 hackbench.throughput
423258 -6.9% 394005 hackbench.throughput_avg
433157 -6.2% 406447 hackbench.throughput_best
411374 -6.8% 383238 hackbench.throughput_worst
143.13 +7.3% 153.65 hackbench.time.elapsed_time
143.13 +7.3% 153.65 hackbench.time.elapsed_time.max
39754543 ± 3% +56.8% 62349308 hackbench.time.involuntary_context_switches
623881 +3.9% 648284 hackbench.time.minor_page_faults
17045 +7.7% 18350 hackbench.time.system_time
900.50 +2.5% 922.71 hackbench.time.user_time
2.019e+08 +23.3% 2.489e+08 hackbench.time.voluntary_context_switches
1.61 -2.3% 1.57 perf-stat.i.MPKI
4.411e+10 -5.0% 4.192e+10 perf-stat.i.branch-instructions
0.41 ± 2% +0.0 0.44 perf-stat.i.branch-miss-rate%
1.744e+08 +1.6% 1.772e+08 perf-stat.i.branch-misses
25.15 -0.6 24.50 perf-stat.i.cache-miss-rate%
3.5e+08 -7.0% 3.255e+08 perf-stat.i.cache-misses
1.398e+09 -3.8% 1.346e+09 perf-stat.i.cache-references
1677956 ± 2% +20.8% 2027400 perf-stat.i.context-switches
1.49 +5.6% 1.57 perf-stat.i.cpi
46084 ± 8% +44.6% 66621 ± 8% perf-stat.i.cpu-migrations
935.91 +8.3% 1013 perf-stat.i.cycles-between-cache-misses
2.175e+11 -5.1% 2.065e+11 perf-stat.i.instructions
0.68 -5.2% 0.64 perf-stat.i.ipc
13.38 ± 2% +21.7% 16.28 perf-stat.i.metric.K/sec
1.61 -2.0% 1.58 perf-stat.overall.MPKI
0.39 +0.0 0.42 perf-stat.overall.branch-miss-rate%
25.05 -0.8 24.23 perf-stat.overall.cache-miss-rate%
1.49 +5.5% 1.57 perf-stat.overall.cpi
926.46 +7.6% 996.92 perf-stat.overall.cycles-between-cache-misses
0.67 -5.2% 0.64 perf-stat.overall.ipc
4.382e+10 -5.0% 4.164e+10 perf-stat.ps.branch-instructions
1.73e+08 +1.5% 1.755e+08 perf-stat.ps.branch-misses
3.475e+08 -7.0% 3.233e+08 perf-stat.ps.cache-misses
1.387e+09 -3.8% 1.334e+09 perf-stat.ps.cache-references
1662988 ± 2% +20.6% 2004942 perf-stat.ps.context-switches
44600 ± 8% +43.7% 64072 ± 7% perf-stat.ps.cpu-migrations
2.161e+11 -5.1% 2.051e+11 perf-stat.ps.instructions
3.105e+13 +2.0% 3.169e+13 perf-stat.total.instructions
8.54 ± 2% -1.0 7.54 perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
8.46 ± 2% -1.0 7.47 perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
8.30 ± 2% -1.0 7.31 perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg
4.38 ± 2% -0.6 3.81 ± 2% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
3.20 ± 3% -0.3 2.85 perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
3.00 ± 3% -0.3 2.67 perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
3.40 ± 3% -0.3 3.10 ± 3% perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
2.30 ± 3% -0.3 2.00 perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
3.25 ± 3% -0.3 2.97 ± 3% perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
3.07 ± 2% -0.3 2.79 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
3.05 ± 3% -0.3 2.79 ± 3% perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic
2.50 ± 3% -0.2 2.29 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.__kmalloc_node_track_caller_noprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
2.18 ± 3% -0.2 1.99 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_noprof.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
1.99 ± 3% -0.2 1.82 perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
1.95 ± 4% -0.2 1.78 ± 2% perf-profile.calltrace.cycles-pp.clear_bhb_loop.read
2.68 ± 3% -0.1 2.54 ± 2% perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write
1.55 ± 3% -0.1 1.42 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
1.55 ± 3% -0.1 1.42 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
1.35 ± 3% -0.1 1.24 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kfree.skb_release_data.consume_skb.unix_stream_read_generic
1.04 ± 3% -0.1 0.96 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.12 ± 3% -0.1 1.04 perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write
0.62 ± 4% -0.1 0.56 perf-profile.calltrace.cycles-pp.__build_skb_around.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
0.72 ± 3% -0.1 0.66 perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg
0.63 ± 2% -0.1 0.57 ± 3% perf-profile.calltrace.cycles-pp.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
0.57 ± 3% -0.0 0.52 ± 2% perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter
1.17 ± 3% +0.2 1.32 ± 6% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
0.42 ± 50% +0.3 0.76 ± 22% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
1.36 ± 3% +0.5 1.88 ± 21% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
1.38 ± 3% +0.5 1.91 ± 21% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
1.43 ± 3% +0.5 1.98 ± 21% perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.63 ± 3% +0.7 2.28 ± 21% perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
36.49 +0.8 37.34 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
35.51 +0.9 36.43 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
38.59 +0.9 39.52 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
38.32 +1.0 39.27 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
34.44 +1.0 35.42 perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
33.04 +1.1 34.12 perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
8.58 ± 2% -1.0 7.58 perf-profile.children.cycles-pp.unix_stream_read_actor
8.35 ± 2% -1.0 7.36 perf-profile.children.cycles-pp.__skb_datagram_iter
8.50 ± 2% -1.0 7.51 perf-profile.children.cycles-pp.skb_copy_datagram_iter
4.40 ± 2% -0.6 3.83 ± 2% perf-profile.children.cycles-pp._copy_to_iter
5.77 ± 2% -0.4 5.32 ± 3% perf-profile.children.cycles-pp.__memcg_slab_free_hook
4.41 ± 3% -0.4 3.98 perf-profile.children.cycles-pp.__check_object_size
4.80 ± 3% -0.4 4.40 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
3.24 ± 3% -0.4 2.89 perf-profile.children.cycles-pp.simple_copy_to_iter
2.98 ± 3% -0.3 2.64 perf-profile.children.cycles-pp.check_heap_object
3.98 ± 3% -0.3 3.64 perf-profile.children.cycles-pp.clear_bhb_loop
3.44 ± 2% -0.3 3.14 ± 3% perf-profile.children.cycles-pp.skb_release_head_state
3.31 ± 2% -0.3 3.03 ± 3% perf-profile.children.cycles-pp.unix_destruct_scm
3.09 ± 3% -0.3 2.82 ± 3% perf-profile.children.cycles-pp.sock_wfree
2.42 ± 3% -0.2 2.23 ± 3% perf-profile.children.cycles-pp.__slab_free
2.59 ± 2% -0.2 2.42 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
1.78 ± 3% -0.2 1.62 perf-profile.children.cycles-pp.entry_SYSCALL_64
2.76 ± 3% -0.1 2.61 ± 2% perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
1.38 ± 3% -0.1 1.25 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.30 ± 4% -0.1 1.19 perf-profile.children.cycles-pp.obj_cgroup_charge
0.65 ± 4% -0.1 0.57 perf-profile.children.cycles-pp.__build_skb_around
0.66 ± 3% -0.1 0.61 perf-profile.children.cycles-pp.refill_obj_stock
0.73 ± 3% -0.1 0.68 perf-profile.children.cycles-pp.__check_heap_object
0.59 ± 3% -0.1 0.54 ± 2% perf-profile.children.cycles-pp.rw_verify_area
0.66 ± 2% -0.1 0.61 ± 3% perf-profile.children.cycles-pp.skb_unlink
0.55 ± 4% -0.0 0.51 ± 2% perf-profile.children.cycles-pp.__virt_addr_valid
0.28 ± 3% -0.0 0.26 perf-profile.children.cycles-pp.__scm_recv_common
0.16 ± 4% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.is_vmalloc_addr
0.16 ± 3% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.security_socket_recvmsg
0.17 ± 2% -0.0 0.16 perf-profile.children.cycles-pp.put_pid
0.14 ± 3% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.manage_oob
0.11 -0.0 0.10 perf-profile.children.cycles-pp.wait_for_unix_gc
0.06 ± 6% +0.0 0.08 ± 11% perf-profile.children.cycles-pp.os_xsave
0.20 ± 3% +0.0 0.23 ± 7% perf-profile.children.cycles-pp.__get_user_8
0.06 ± 6% +0.0 0.09 ± 17% perf-profile.children.cycles-pp.sched_clock
0.06 ± 6% +0.0 0.09 ± 14% perf-profile.children.cycles-pp.check_preempt_wakeup_fair
0.09 ± 5% +0.0 0.13 ± 18% perf-profile.children.cycles-pp.__switch_to
0.08 ± 4% +0.0 0.12 ± 21% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.15 ± 6% +0.0 0.20 ± 10% perf-profile.children.cycles-pp.__dequeue_entity
0.25 ± 3% +0.0 0.29 ± 9% perf-profile.children.cycles-pp.rseq_ip_fixup
0.09 ± 10% +0.0 0.14 ± 15% perf-profile.children.cycles-pp.pick_eevdf
0.13 ± 7% +0.0 0.18 ± 14% perf-profile.children.cycles-pp.__enqueue_entity
0.08 ± 10% +0.0 0.12 ± 27% perf-profile.children.cycles-pp.wakeup_preempt
0.01 ±200% +0.1 0.06 ± 11% perf-profile.children.cycles-pp.vruntime_eligible
0.01 ±200% +0.1 0.07 ± 23% perf-profile.children.cycles-pp.___perf_sw_event
0.01 ±200% +0.1 0.08 ± 27% perf-profile.children.cycles-pp.put_prev_entity
0.31 ± 2% +0.1 0.38 ± 12% perf-profile.children.cycles-pp.__rseq_handle_notify_resume
0.22 ± 7% +0.1 0.30 ± 12% perf-profile.children.cycles-pp.set_next_entity
0.14 ± 5% +0.1 0.22 ± 22% perf-profile.children.cycles-pp.pick_task_fair
0.14 ± 44% +0.1 0.24 ± 15% perf-profile.children.cycles-pp.get_any_partial
0.27 ± 5% +0.1 0.37 ± 15% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.33 ± 4% +0.1 0.47 ± 22% perf-profile.children.cycles-pp.enqueue_entity
0.30 ± 4% +0.2 0.46 ± 26% perf-profile.children.cycles-pp.update_load_avg
0.48 ± 4% +0.2 0.72 ± 26% perf-profile.children.cycles-pp.enqueue_task_fair
0.51 ± 3% +0.2 0.75 ± 27% perf-profile.children.cycles-pp.enqueue_task
0.48 ± 6% +0.3 0.75 ± 23% perf-profile.children.cycles-pp.pick_next_task_fair
0.49 ± 6% +0.3 0.76 ± 24% perf-profile.children.cycles-pp.__pick_next_task
0.60 ± 4% +0.3 0.89 ± 25% perf-profile.children.cycles-pp.ttwu_do_activate
1.67 ± 2% +0.6 2.23 ± 20% perf-profile.children.cycles-pp.schedule_timeout
1.64 ± 3% +0.7 2.29 ± 21% perf-profile.children.cycles-pp.unix_stream_data_wait
1.78 ± 4% +0.7 2.53 ± 21% perf-profile.children.cycles-pp.schedule
1.78 ± 4% +0.8 2.54 ± 22% perf-profile.children.cycles-pp.__schedule
36.58 +0.8 37.42 perf-profile.children.cycles-pp.ksys_write
35.60 +0.9 36.51 perf-profile.children.cycles-pp.vfs_write
34.52 +1.0 35.49 perf-profile.children.cycles-pp.sock_write_iter
33.31 +1.0 34.36 perf-profile.children.cycles-pp.unix_stream_sendmsg
4.37 ± 2% -0.6 3.79 ± 2% perf-profile.self.cycles-pp._copy_to_iter
3.94 ± 3% -0.3 3.60 perf-profile.self.cycles-pp.clear_bhb_loop
2.27 ± 3% -0.3 1.98 perf-profile.self.cycles-pp.check_heap_object
3.29 ± 2% -0.3 3.01 ± 5% perf-profile.self.cycles-pp.__memcg_slab_free_hook
2.03 ± 4% -0.3 1.76 ± 2% perf-profile.self.cycles-pp.kmem_cache_free
2.50 ± 3% -0.2 2.25 ± 3% perf-profile.self.cycles-pp.sock_wfree
2.61 ± 2% -0.2 2.37 ± 3% perf-profile.self.cycles-pp.unix_stream_read_generic
2.30 ± 3% -0.2 2.09 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
2.37 ± 3% -0.2 2.18 ± 3% perf-profile.self.cycles-pp.__slab_free
1.04 ± 4% -0.2 0.86 ± 5% perf-profile.self.cycles-pp.skb_release_data
2.19 ± 4% -0.2 2.01 ± 2% perf-profile.self.cycles-pp.mod_objcg_state
1.31 ± 3% -0.1 1.18 perf-profile.self.cycles-pp.__kmalloc_node_track_caller_noprof
1.33 ± 3% -0.1 1.21 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.04 ± 3% -0.1 0.93 perf-profile.self.cycles-pp.kmem_cache_alloc_node_noprof
1.13 ± 3% -0.1 1.02 perf-profile.self.cycles-pp.__alloc_skb
1.38 ± 2% -0.1 1.29 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.74 ± 3% -0.1 0.66 perf-profile.self.cycles-pp.__skb_datagram_iter
1.11 ± 3% -0.1 1.03 ± 2% perf-profile.self.cycles-pp.sock_write_iter
0.80 ± 3% -0.1 0.74 ± 2% perf-profile.self.cycles-pp.write
0.60 ± 4% -0.1 0.54 perf-profile.self.cycles-pp.__build_skb_around
0.84 ± 4% -0.1 0.78 perf-profile.self.cycles-pp.sock_read_iter
0.69 ± 3% -0.1 0.64 perf-profile.self.cycles-pp.__check_heap_object
0.62 ± 3% -0.1 0.57 perf-profile.self.cycles-pp.refill_obj_stock
0.82 -0.0 0.77 perf-profile.self.cycles-pp.read
0.80 ± 3% -0.0 0.75 ± 3% perf-profile.self.cycles-pp.do_syscall_64
0.51 ± 4% -0.0 0.47 perf-profile.self.cycles-pp.__virt_addr_valid
0.46 ± 2% -0.0 0.43 ± 2% perf-profile.self.cycles-pp.kfree
0.59 ± 3% -0.0 0.56 ± 2% perf-profile.self.cycles-pp.__check_object_size
0.44 ± 2% -0.0 0.41 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.36 ± 3% -0.0 0.32 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.43 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.unix_write_space
0.37 ± 4% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.x64_sys_call
0.34 ± 3% -0.0 0.31 ± 2% perf-profile.self.cycles-pp.__cond_resched
0.29 ± 3% -0.0 0.27 perf-profile.self.cycles-pp.ksys_write
0.30 ± 2% -0.0 0.28 ± 2% perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
0.18 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.unix_destruct_scm
0.21 ± 3% -0.0 0.19 perf-profile.self.cycles-pp.__scm_recv_common
0.25 -0.0 0.23 ± 2% perf-profile.self.cycles-pp.kmalloc_reserve
0.15 ± 2% -0.0 0.14 perf-profile.self.cycles-pp.skb_unlink
0.15 ± 2% -0.0 0.14 perf-profile.self.cycles-pp.unix_scm_to_skb
0.07 ± 9% +0.0 0.10 ± 19% perf-profile.self.cycles-pp.pick_eevdf
0.09 ± 5% +0.0 0.13 ± 16% perf-profile.self.cycles-pp.__switch_to
0.11 ± 7% +0.0 0.14 ± 10% perf-profile.self.cycles-pp.__dequeue_entity
0.08 ± 5% +0.0 0.12 ± 22% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.02 ±122% +0.0 0.06 ± 17% perf-profile.self.cycles-pp.native_sched_clock
0.13 ± 8% +0.1 0.18 ± 14% perf-profile.self.cycles-pp.__enqueue_entity
0.00 +0.1 0.06 ± 9% perf-profile.self.cycles-pp.vruntime_eligible
0.27 ± 5% +0.1 0.37 ± 15% perf-profile.self.cycles-pp.switch_mm_irqs_off
***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/membarrier/stress-ng/60s
commit:
f553741ac8 ("sched: Cancel the slice protection of the idle entity")
2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")
f553741ac8c0e467 2ae891b826958b60919ea21c727
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.08 -0.1 0.99 mpstat.cpu.all.irq%
67.18 ± 2% -11.9% 59.20 ± 5% mpstat.max_utilization_pct
3401 ± 19% -31.4% 2332 ± 18% perf-c2c.DRAM.remote
2396 ± 3% -23.1% 1844 ± 18% perf-c2c.HITM.remote
29248 +14.3% 33418 vmstat.system.cs
788485 -9.1% 716631 vmstat.system.in
191106 -1.7% 187946 proc-vmstat.nr_anon_pages
535277 ± 2% +5.6% 565009 ± 4% proc-vmstat.numa_hit
469052 ± 2% +6.3% 498763 ± 5% proc-vmstat.numa_local
51285 ± 7% +54.3% 79119 ± 31% proc-vmstat.numa_pages_migrated
51285 ± 7% +54.3% 79119 ± 31% proc-vmstat.pgmigrate_success
16417 ± 7% +131.4% 37986 ± 78% proc-vmstat.pgreuse
505.28 -10.6% 451.92 stress-ng.membarrier.membarrier_calls_per_sec
97160 -10.5% 86939 stress-ng.membarrier.ops
1618 -10.5% 1448 stress-ng.membarrier.ops_per_sec
55094 ± 5% +277.5% 207976 ± 9% stress-ng.time.involuntary_context_switches
3195 ± 2% -8.3% 2931 stress-ng.time.percent_of_cpu_this_job_got
1921 ± 2% -8.3% 1761 stress-ng.time.system_time
1047923 +5.9% 1109900 stress-ng.time.voluntary_context_switches
5.501e+09 ± 2% -7.8% 5.074e+09 perf-stat.i.branch-instructions
30090 +14.4% 34431 perf-stat.i.context-switches
1.041e+11 ± 2% -7.6% 9.627e+10 perf-stat.i.cpu-cycles
10683 +6.7% 11402 perf-stat.i.cpu-migrations
2.73e+10 ± 2% -7.6% 2.522e+10 perf-stat.i.instructions
5.406e+09 ± 2% -7.8% 4.985e+09 perf-stat.ps.branch-instructions
29571 +14.4% 33836 perf-stat.ps.context-switches
1.024e+11 ± 2% -7.6% 9.46e+10 perf-stat.ps.cpu-cycles
10498 +6.7% 11203 perf-stat.ps.cpu-migrations
2.683e+10 ± 2% -7.6% 2.478e+10 perf-stat.ps.instructions
1.631e+12 ± 2% -7.7% 1.505e+12 perf-stat.total.instructions
698086 ± 4% -12.0% 614339 ± 3% sched_debug.cfs_rq:/.avg_vruntime.avg
918198 ± 7% -13.5% 794083 ± 6% sched_debug.cfs_rq:/.avg_vruntime.max
650282 ± 4% -12.9% 566525 ± 4% sched_debug.cfs_rq:/.avg_vruntime.min
698086 ± 4% -12.0% 614339 ± 3% sched_debug.cfs_rq:/.min_vruntime.avg
918198 ± 7% -13.5% 794083 ± 6% sched_debug.cfs_rq:/.min_vruntime.max
650282 ± 4% -12.9% 566525 ± 4% sched_debug.cfs_rq:/.min_vruntime.min
13.48 ± 36% +250.6% 47.25 ± 40% sched_debug.cfs_rq:/.removed.load_avg.avg
77.26 ± 17% +91.9% 148.27 ± 24% sched_debug.cfs_rq:/.removed.load_avg.stddev
5.08 ± 33% +246.5% 17.60 ± 35% sched_debug.cfs_rq:/.removed.runnable_avg.avg
212.33 ± 20% +30.1% 276.17 ± 7% sched_debug.cfs_rq:/.removed.runnable_avg.max
30.44 ± 21% +89.0% 57.52 ± 14% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
5.08 ± 33% +246.6% 17.60 ± 35% sched_debug.cfs_rq:/.removed.util_avg.avg
212.25 ± 21% +30.1% 276.08 ± 7% sched_debug.cfs_rq:/.removed.util_avg.max
30.43 ± 21% +89.0% 57.51 ± 14% sched_debug.cfs_rq:/.removed.util_avg.stddev
15701 +12.8% 17719 sched_debug.cpu.nr_switches.avg
11778 ± 7% +20.3% 14165 ± 8% sched_debug.cpu.nr_switches.min
-202.17 +21.0% -244.58 sched_debug.cpu.nr_uninterruptible.min
1.43 ± 36% -99.6% 0.01 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.94 ± 23% -91.9% 0.08 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
1.60 ± 68% -99.9% 0.00 ±223% perf-sched.sch_delay.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
1.39 ± 8% +71.7% 2.38 ± 7% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
1.95 ± 5% +23.7% 2.41 ± 5% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
0.89 ± 4% -16.0% 0.75 ± 3% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
0.01 ± 25% +75.0% 0.02 ± 34% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.06 ± 11% -37.5% 0.04 ± 40% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.80 ±145% +478.8% 4.62 ± 52% perf-sched.sch_delay.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
5.29 ± 41% -99.9% 0.01 ±223% perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
6.37 ± 13% -93.7% 0.40 ±223% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
2.22 ± 49% -99.9% 0.00 ±223% perf-sched.sch_delay.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
10.40 ± 13% +32.1% 13.74 ± 5% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
4.55 ± 5% -34.9% 2.96 ± 42% perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.98 ± 4% +33.4% 1.30 ± 6% perf-sched.total_sch_delay.average.ms
22.34 -12.3% 19.59 perf-sched.total_wait_and_delay.average.ms
102076 +18.6% 121096 perf-sched.total_wait_and_delay.count.ms
21.37 -14.4% 18.29 perf-sched.total_wait_time.average.ms
515.07 ± 36% +63.2% 840.46 ± 16% perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
11.25 ± 5% +56.4% 17.59 ± 7% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
15.80 -13.5% 13.67 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
487.31 ± 4% +16.4% 567.38 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
8.00 ± 26% +95.8% 15.67 ± 20% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1384 ± 12% +58.1% 2188 ± 8% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
10678 ± 7% +270.1% 39521 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
85629 -12.4% 75039 ± 3% perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
2443 ± 44% -58.3% 1018 perf-sched.wait_and_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
2099 ± 55% -76.1% 501.21 perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
15.94 ± 9% -86.6% 2.13 ±223% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
515.06 ± 36% +63.2% 840.45 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
15.25 ± 3% -85.0% 2.29 ±223% perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
427.24 ± 78% -99.6% 1.55 ±107% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
10.38 ± 53% -95.2% 0.50 ±223% perf-sched.wait_time.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
48.58 ±185% -94.2% 2.80 ± 99% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
9.86 ± 5% +54.2% 15.21 ± 7% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
14.92 -13.4% 12.92 perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
1.30 ± 8% -11.1% 1.15 ± 6% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
487.30 ± 4% +16.4% 567.36 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
6.13 ±141% +268.5% 22.60 ± 17% perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
25.13 ± 9% -91.5% 2.13 ±223% perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
25.92 ± 12% -86.1% 3.61 ±223% perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
2260 ± 59% -99.9% 3.00 ±118% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
13.02 ± 43% -96.2% 0.50 ±223% perf-sched.wait_time.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
2443 ± 44% -58.3% 1018 perf-sched.wait_time.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
2097 ± 55% -76.1% 500.54 perf-sched.wait_time.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki