Message-ID: <202508071007.7b2e45c0-lkp@intel.com>
Date: Thu, 7 Aug 2025 16:34:21 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Chris Mason <clm@...a.com>, Juri Lelli <juri.lelli@...hat.com>,
<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [sched/deadline] cccb45d7c4:
stress-ng.netdev.ops_per_sec 61.6% regression
Hello,
Besides the regressions (and improvements) we reported as
"[tip:sched/core] [sched/deadline] cccb45d7c4: will-it-scale.per_thread_ops 36.7% regression"
in
https://lore.kernel.org/all/202507230755.5fe8e03e-lkp@intel.com/
we have now captured two more regressions with this commit in mainline. Just FYI.
kernel test robot noticed a 61.6% regression of stress-ng.netdev.ops_per_sec on:
commit: cccb45d7c4295bbfeba616582d0249f2d21e6df5 ("sched/deadline: Less agressive dl_server handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[regression still present on linus/master 7e161a991ea71e6ec526abc8f40c6852ebe3d946]
[regression still present on linux-next/master afec768a6a8fe7fb02a08ffce5f2f556f51d4b52]
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: netdev
cpufreq_governor: performance
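For reference, outside the lkp-tests harness these parameters roughly correspond to running the stress-ng netdev stressor (which mostly exercises network-device ioctls such as SIOCGIFCONF) with one worker per CPU for 60 seconds. A minimal sketch, assuming nr_threads=100% maps to one stressor instance per online CPU; the authoritative job file is in the archive linked below:

  # hypothetical standalone approximation of the LKP job above
  stress-ng --netdev $(nproc) --timeout 60s --metrics-brief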
In addition, the commit also has a significant impact on the following test (a rough standalone approximation of the netperf case is sketched below the table):
+------------------+----------------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -2.2% regression |
| test machine | 20 threads 1 socket (Comet Lake) with 16G memory |
| test parameters | cluster=cs-localhost |
| | cpufreq_governor=performance |
| | ip=ipv4 |
| | nr_threads=200% |
| | runtime=900s |
| | test=TCP_STREAM |
+------------------+----------------------------------------------------------------------------------------------+
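For the netperf case, a rough loopback approximation of the parameters above (cs-localhost, ipv4, TCP_STREAM, 900s) is sketched here; the flag mapping is an assumption, and the harness actually runs multiple client instances in parallel (nr_threads=200%), so absolute numbers will differ:

  # hypothetical loopback approximation of the netperf job above
  netserver                                      # start the local netperf server
  netperf -H 127.0.0.1 -4 -t TCP_STREAM -l 900   # one IPv4 TCP_STREAM client for 900s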
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202508071007.7b2e45c0-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250807/202508071007.7b2e45c0-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp3/netdev/stress-ng/60s
commit:
570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")
570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.63e+08 +643.3% 2.698e+09 ± 13% cpuidle..time
204743 ± 3% +1975.4% 4249334 ± 10% cpuidle..usage
4.97 +341.0% 21.93 ± 10% vmstat.cpu.id
184.63 -22.2% 143.65 ± 4% vmstat.procs.r
3473 +3085.4% 110658 ± 8% vmstat.system.cs
408964 +2.0% 417266 vmstat.system.in
1113721 +17.7% 1310862 meminfo.Active
1113721 +17.7% 1310862 meminfo.Active(anon)
177366 ± 2% +11.0% 196822 ± 4% meminfo.DirectMap4k
298665 +19.2% 355990 ± 2% meminfo.Mapped
420393 +45.3% 610884 ± 2% meminfo.Shmem
2.96 ± 20% +16.7 19.66 ± 14% mpstat.cpu.all.idle%
0.26 +0.3 0.61 ± 3% mpstat.cpu.all.irq%
0.00 ± 20% +0.0 0.03 ± 5% mpstat.cpu.all.soft%
96.53 -17.1 79.47 ± 3% mpstat.cpu.all.sys%
14.83 ± 61% +238.2% 50.17 ± 20% mpstat.max_utilization.seconds
100.00 -14.6% 85.36 ± 3% mpstat.max_utilization_pct
9547602 -61.6% 3667923 ± 4% stress-ng.netdev.ops
159180 -61.6% 61151 ± 4% stress-ng.netdev.ops_per_sec
67355 +2.8% 69256 stress-ng.time.minor_page_faults
19016 -21.0% 15021 ± 4% stress-ng.time.percent_of_cpu_this_job_got
11432 -21.0% 9033 ± 4% stress-ng.time.system_time
35368 ± 2% +9542.0% 3410177 ± 8% stress-ng.time.voluntary_context_switches
278515 +17.5% 327222 proc-vmstat.nr_active_anon
995358 +4.7% 1042520 proc-vmstat.nr_file_pages
74999 +18.3% 88740 ± 2% proc-vmstat.nr_mapped
105146 +44.9% 152305 ± 2% proc-vmstat.nr_shmem
278515 +17.5% 327222 proc-vmstat.nr_zone_active_anon
826913 +7.7% 890858 proc-vmstat.numa_hit
629070 +10.1% 692863 proc-vmstat.numa_local
873883 +7.2% 936679 proc-vmstat.pgalloc_normal
418067 +2.9% 430228 proc-vmstat.pgfault
0.10 ± 3% +37.8% 0.14 ± 2% perf-stat.i.MPKI
2.248e+10 -22.9% 1.733e+10 ± 4% perf-stat.i.branch-instructions
0.10 ± 2% +0.0 0.15 ± 4% perf-stat.i.branch-miss-rate%
18947128 +20.6% 22857416 perf-stat.i.branch-misses
35.42 -17.1 18.35 ± 9% perf-stat.i.cache-miss-rate%
9364646 +11.9% 10482390 ± 2% perf-stat.i.cache-misses
27205535 +125.0% 61210467 ± 11% perf-stat.i.cache-references
3273 ± 2% +3392.2% 114320 ± 8% perf-stat.i.context-switches
5.35 +3.8% 5.56 perf-stat.i.cpi
6.028e+11 -20.1% 4.818e+11 ± 4% perf-stat.i.cpu-cycles
327.85 +343.1% 1452 ± 9% perf-stat.i.cpu-migrations
68905 -27.8% 49741 ± 2% perf-stat.i.cycles-between-cache-misses
1.12e+11 -23.0% 8.626e+10 ± 4% perf-stat.i.instructions
0.19 -3.5% 0.18 perf-stat.i.ipc
4316 ± 2% +6.1% 4578 perf-stat.i.minor-faults
4316 ± 2% +6.1% 4578 perf-stat.i.page-faults
0.08 +45.3% 0.12 ± 2% perf-stat.overall.MPKI
0.08 +0.0 0.13 ± 5% perf-stat.overall.branch-miss-rate%
34.42 -17.1 17.28 ± 9% perf-stat.overall.cache-miss-rate%
5.38 +3.8% 5.59 perf-stat.overall.cpi
64384 -28.5% 46017 ± 2% perf-stat.overall.cycles-between-cache-misses
0.19 -3.7% 0.18 perf-stat.overall.ipc
2.211e+10 -22.9% 1.705e+10 ± 4% perf-stat.ps.branch-instructions
18642811 +20.5% 22455956 perf-stat.ps.branch-misses
9210208 +11.8% 10296398 ± 2% perf-stat.ps.cache-misses
26761745 +124.9% 60190009 ± 11% perf-stat.ps.cache-references
3220 ± 2% +3391.4% 112425 ± 8% perf-stat.ps.context-switches
5.93e+11 -20.1% 4.739e+11 ± 4% perf-stat.ps.cpu-cycles
322.54 +343.0% 1428 ± 9% perf-stat.ps.cpu-migrations
1.102e+11 -23.0% 8.484e+10 ± 4% perf-stat.ps.instructions
4239 ± 2% +5.3% 4464 perf-stat.ps.minor-faults
4239 ± 2% +5.3% 4464 perf-stat.ps.page-faults
6.771e+12 -23.7% 5.169e+12 ± 4% perf-stat.total.instructions
5992277 -35.8% 3846765 ± 8% sched_debug.cfs_rq:/.avg_vruntime.avg
6049811 -19.2% 4888185 ± 5% sched_debug.cfs_rq:/.avg_vruntime.max
5847973 -63.4% 2140155 ± 6% sched_debug.cfs_rq:/.avg_vruntime.min
30248 ± 13% +3774.5% 1171963 ± 2% sched_debug.cfs_rq:/.avg_vruntime.stddev
0.53 -21.2% 0.42 ± 4% sched_debug.cfs_rq:/.h_nr_queued.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_queued.min
0.17 ± 10% +99.4% 0.34 ± 3% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.53 -21.3% 0.42 ± 4% sched_debug.cfs_rq:/.h_nr_runnable.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_runnable.min
0.17 ± 10% +98.9% 0.34 ± 3% sched_debug.cfs_rq:/.h_nr_runnable.stddev
2696 -100.0% 0.00 sched_debug.cfs_rq:/.load.min
2.50 -83.3% 0.42 ±107% sched_debug.cfs_rq:/.load_avg.min
5992277 -35.8% 3846765 ± 8% sched_debug.cfs_rq:/.min_vruntime.avg
6049811 -19.2% 4888185 ± 5% sched_debug.cfs_rq:/.min_vruntime.max
5847973 -63.4% 2140155 ± 6% sched_debug.cfs_rq:/.min_vruntime.min
30248 ± 13% +3774.5% 1171963 ± 2% sched_debug.cfs_rq:/.min_vruntime.stddev
0.53 -21.2% 0.42 ± 4% sched_debug.cfs_rq:/.nr_queued.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.nr_queued.min
0.12 ± 8% +185.0% 0.33 ± 4% sched_debug.cfs_rq:/.nr_queued.stddev
588.21 -17.7% 484.36 ± 3% sched_debug.cfs_rq:/.runnable_avg.avg
489.25 ± 6% -95.0% 24.25 ±141% sched_debug.cfs_rq:/.runnable_avg.min
136.65 ± 9% +70.5% 233.00 ± 4% sched_debug.cfs_rq:/.runnable_avg.stddev
585.65 -17.5% 482.95 ± 3% sched_debug.cfs_rq:/.util_avg.avg
410.58 ± 29% -94.4% 23.00 ±141% sched_debug.cfs_rq:/.util_avg.min
117.24 ± 7% +99.4% 233.84 ± 4% sched_debug.cfs_rq:/.util_avg.stddev
520.05 -32.1% 353.31 ± 6% sched_debug.cfs_rq:/.util_est.avg
1139 ± 14% -19.1% 921.17 ± 11% sched_debug.cfs_rq:/.util_est.max
387.58 ± 45% -100.0% 0.00 sched_debug.cfs_rq:/.util_est.min
67.01 ± 18% +283.1% 256.74 ± 2% sched_debug.cfs_rq:/.util_est.stddev
669274 ± 19% +60.4% 1073556 ± 6% sched_debug.cpu.avg_idle.avg
1885708 ± 21% +44.0% 2714848 ± 7% sched_debug.cpu.avg_idle.max
7213 ± 87% +597.3% 50301 ± 5% sched_debug.cpu.avg_idle.min
16.82 ± 12% -35.5% 10.86 ± 8% sched_debug.cpu.clock.stddev
2573 -21.9% 2010 ± 4% sched_debug.cpu.curr->pid.avg
2303 ± 12% -100.0% 0.00 sched_debug.cpu.curr->pid.min
448.26 ± 8% +220.4% 1436 ± 4% sched_debug.cpu.curr->pid.stddev
235851 ± 10% +17.7% 277484 ± 9% sched_debug.cpu.max_idle_balance_cost.stddev
0.53 -21.5% 0.42 ± 4% sched_debug.cpu.nr_running.avg
0.50 -100.0% 0.00 sched_debug.cpu.nr_running.min
0.16 ± 11% +108.2% 0.33 ± 3% sched_debug.cpu.nr_running.stddev
1673 ± 16% +1028.2% 18878 ± 7% sched_debug.cpu.nr_switches.avg
2564 ± 67% +752.3% 21858 ± 4% sched_debug.cpu.nr_switches.stddev
0.00 ±111% +6575.0% 0.12 ± 14% sched_debug.cpu.nr_uninterruptible.avg
-31.42 +179.6% -87.83 sched_debug.cpu.nr_uninterruptible.min
0.03 ± 7% +349.1% 0.12 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.03 ± 99% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.08 ±107% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.01 ± 8% +78.1% 0.01 ± 17% perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.79 ± 29% -99.2% 0.01 ± 9% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.60 ± 65% -92.2% 0.05 ±192% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 70% -71.0% 0.01 ± 36% perf-sched.sch_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
0.23 ± 26% -90.3% 0.02 ± 36% perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.50 ± 42% -86.5% 0.07 ±120% perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
0.05 ± 45% -83.5% 0.01 ± 6% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.06 ± 28% -73.9% 0.02 ± 35% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 10% +136.7% 0.02 ± 46% perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.26 ± 29% -96.0% 0.01 ± 85% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
0.30 ± 40% -95.2% 0.01 ± 21% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
0.31 ± 41% -95.3% 0.01 ± 20% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
0.20 ± 39% -96.3% 0.01 ± 25% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.01 +73.3% 0.01 ± 14% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
0.01 ± 32% -39.7% 0.01 ± 6% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.21 ± 22% -96.2% 0.01 ± 22% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.27 ±139% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.58 ± 90% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.01 ± 21% +204.2% 0.02 ± 75% perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
3.39 ± 8% -99.7% 0.01 ± 7% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
2.08 ± 55% -92.1% 0.16 ±210% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
3.41 ± 7% -90.9% 0.31 ± 97% perf-sched.sch_delay.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
6.55 ± 39% -91.8% 0.54 ± 56% perf-sched.sch_delay.max.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
2.46 ± 39% -98.1% 0.05 ± 73% perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.88 ± 49% -92.6% 0.44 ± 88% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 21% +433.9% 0.06 ± 78% perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
6.90 ± 37% +311.9% 28.44 ± 8% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
9.31 ± 16% +223.8% 30.16 ± 11% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
3.20 ± 6% -97.0% 0.10 ± 75% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.01 ± 8% +174.5% 0.02 ± 47% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
1.83 ± 20% -95.6% 0.08 ± 21% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
2.67 ± 38% -98.4% 0.04 ± 82% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.18 ± 37% -91.7% 0.01 ± 18% perf-sched.total_sch_delay.average.ms
10.22 +205.5% 31.22 ± 9% perf-sched.total_sch_delay.max.ms
108.64 ± 6% -93.1% 7.50 ± 3% perf-sched.total_wait_and_delay.average.ms
12100 ± 7% +1924.9% 245027 ± 4% perf-sched.total_wait_and_delay.count.ms
4980 -18.5% 4056 ± 8% perf-sched.total_wait_and_delay.max.ms
108.47 ± 6% -93.1% 7.48 ± 3% perf-sched.total_wait_time.average.ms
4980 -18.5% 4056 ± 8% perf-sched.total_wait_time.max.ms
7.85 -92.1% 0.62 ±223% perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
589.78 ± 7% +28.3% 756.97 perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
1.16 ± 28% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
500.89 -74.7% 126.73 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
1.20 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
6.80 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
47.83 ± 7% -22.6% 37.00 perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
109.67 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
602.00 ± 48% -93.8% 37.33 ±223% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
24.00 +438.9% 129.33 ± 15% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
3099 ± 8% +3753.2% 119435 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
3170 ± 8% +3674.8% 119693 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
86.33 -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
712.00 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
1707 ± 2% +48.8% 2540 ± 19% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
449.67 ± 4% +20.9% 543.67 ± 5% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4980 -96.7% 166.80 ±223% perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
5.85 ± 18% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.96 ± 35% +145.9% 34.34 ± 34% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
18.63 ± 16% +81.2% 33.75 ± 25% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
7.17 ± 12% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
479.50 ± 8% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
7.82 -50.2% 3.90 ± 8% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.25 ±140% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.06 ±147% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
589.55 ± 7% +28.4% 756.94 perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.20 ±151% +304.7% 0.82 ± 29% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1.10 ± 31% -70.5% 0.33 ± 9% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
500.62 -74.7% 126.72 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
0.39 ± 43% +156.4% 1.00 ± 17% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
0.40 ± 39% +150.1% 1.01 ± 17% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
1.00 ± 8% -41.3% 0.58 ± 5% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.21 ± 24% -98.1% 0.00 ± 38% perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
4980 -79.9% 1000 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
1.15 ±108% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.44 ± 96% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.41 ±152% +306.1% 1.65 ± 29% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
3.88 ± 7% -66.4% 1.30 ± 27% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.36 ± 5% -61.1% 2.09 ± 4% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
2.67 ± 38% -98.7% 0.03 ± 71% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
***************************************************************************************************
lkp-cml-d02: 20 threads 1 socket (Comet Lake) with 16G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-9.4/200%/debian-12-x86_64-20240206.cgz/900s/lkp-cml-d02/TCP_STREAM/netperf
commit:
570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")
570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0
---------------- ---------------------------
%stddev %change %stddev
\ | \
8399 +29.1% 10840 ± 13% vmstat.system.cs
974775 ± 5% +12.9% 1100703 ± 12% sched_debug.cpu.avg_idle.max
8.99 ± 3% +8.1% 9.72 ± 6% sched_debug.cpu.clock.stddev
208144 +24.9% 260012 ± 12% sched_debug.cpu.nr_switches.avg
166255 +16.0% 192901 ± 11% sched_debug.cpu.nr_switches.stddev
31290 +3.7% 32458 ± 2% proc-vmstat.nr_shmem
2.324e+08 -2.1% 2.274e+08 proc-vmstat.numa_hit
2.324e+08 -2.1% 2.274e+08 proc-vmstat.numa_local
1.851e+09 -2.1% 1.812e+09 proc-vmstat.pgalloc_normal
1.851e+09 -2.1% 1.812e+09 proc-vmstat.pgfree
1683 -2.2% 1647 netperf.ThroughputBoth_Mbps
67336 -2.2% 65887 netperf.ThroughputBoth_total_Mbps
1683 -2.2% 1647 netperf.Throughput_Mbps
67336 -2.2% 65887 netperf.Throughput_total_Mbps
2006974 +41.4% 2838711 ± 16% netperf.time.involuntary_context_switches
4.624e+08 -2.2% 4.524e+08 netperf.workload
117.25 +1.7% 119.19 perf-stat.i.MPKI
6.963e+08 -1.5% 6.858e+08 perf-stat.i.branch-instructions
8356 +29.1% 10785 ± 13% perf-stat.i.context-switches
25.44 +1.5% 25.82 perf-stat.i.cpi
117.06 +2.5% 119.93 perf-stat.i.cpu-migrations
3.35e+09 -1.5% 3.3e+09 perf-stat.i.instructions
0.04 -1.5% 0.04 perf-stat.i.ipc
115.09 +1.7% 117.03 perf-stat.overall.MPKI
24.97 +1.5% 25.35 perf-stat.overall.cpi
0.04 -1.5% 0.04 perf-stat.overall.ipc
6.954e+08 -1.5% 6.849e+08 perf-stat.ps.branch-instructions
8346 +29.1% 10773 ± 13% perf-stat.ps.context-switches
116.93 +2.5% 119.85 perf-stat.ps.cpu-migrations
3.346e+09 -1.5% 3.296e+09 perf-stat.ps.instructions
3.021e+12 -1.5% 2.976e+12 perf-stat.total.instructions
7.77 ± 4% -24.0% 5.91 ± 16% perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
5.10 ± 14% -19.5% 4.11 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
6.21 ± 6% -21.0% 4.91 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
6.78 ± 12% -23.9% 5.16 ± 12% perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
6.41 ± 3% -21.6% 5.03 ± 15% perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
1.14 ±119% +155.9% 2.90 ± 51% perf-sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.91 -21.7% 5.41 ± 13% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
4.82 ± 18% -35.7% 3.10 ± 17% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
7.39 ± 4% -25.4% 5.51 ± 13% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
1.65 ± 24% +33.5% 2.21 ± 16% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
5.41 -17.6% 4.46 ± 11% perf-sched.total_sch_delay.average.ms
18.99 -16.5% 15.86 ± 11% perf-sched.total_wait_and_delay.average.ms
48294 +23.0% 59391 ± 12% perf-sched.total_wait_and_delay.count.ms
13.58 -16.0% 11.40 ± 10% perf-sched.total_wait_time.average.ms
15.56 ± 4% -24.0% 11.83 ± 16% perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
102.17 ± 8% -28.6% 72.96 ± 30% perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
13.81 ± 5% -20.3% 11.01 ± 12% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
14.64 -21.1% 11.55 ± 13% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
641.16 ± 6% +7.6% 689.79 ± 5% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4579 ± 6% +44.6% 6622 ± 25% perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
101.80 ± 3% +12.0% 114.00 ± 8% perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
35.20 ± 15% +35.4% 47.67 ± 29% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
23823 +31.1% 31228 ± 15% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
1456 ± 3% +28.9% 1878 ± 18% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
764.24 ± 18% -40.2% 457.29 ± 29% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
7.79 ± 4% -24.0% 5.92 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
7.60 ± 5% -19.7% 6.10 ± 12% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
6.81 ± 12% -23.3% 5.22 ± 13% perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
8.41 ± 3% -22.1% 6.55 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
7.73 -20.6% 6.14 ± 12% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.04 ± 17% -28.6% 7.17 ± 9% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
641.09 ± 6% +7.5% 689.00 ± 5% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
764.17 ± 18% -40.2% 457.17 ± 29% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
41.85 -0.3 41.58 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
0.64 ± 8% -0.3 0.38 ± 71% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
0.62 ± 3% -0.2 0.38 ± 70% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
0.60 ± 3% -0.2 0.37 ± 70% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
34.37 -0.2 34.15 perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
33.61 -0.2 33.39 perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
34.89 -0.2 34.74 perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
0.76 ± 7% -0.1 0.64 ± 12% perf-profile.calltrace.cycles-pp.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
99.42 -0.1 99.36 perf-profile.calltrace.cycles-pp.main
2.42 ± 2% +0.1 2.49 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
2.43 ± 2% +0.1 2.50 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
3.86 ± 2% +0.1 3.96 perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
3.88 +0.1 3.98 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
3.88 +0.1 3.98 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
3.99 ± 2% +0.1 4.10 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
4.02 ± 2% +0.1 4.13 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
4.02 ± 2% +0.1 4.13 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
54.53 +0.2 54.68 perf-profile.calltrace.cycles-pp.recv.recv_omni.process_requests.spawn_child.accept_connection
54.44 +0.2 54.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests
54.44 +0.2 54.60 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests.spawn_child
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
41.87 -0.3 41.60 perf-profile.children.cycles-pp.tcp_sendmsg_locked
34.39 -0.2 34.16 perf-profile.children.cycles-pp._copy_from_iter
34.91 -0.2 34.76 perf-profile.children.cycles-pp.skb_do_copy_data_nocache
1.38 ± 4% -0.1 1.25 ± 4% perf-profile.children.cycles-pp.napi_consume_skb
0.86 ± 3% -0.1 0.74 ± 12% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
3.55 -0.1 3.43 ± 2% perf-profile.children.cycles-pp.skb_release_data
99.55 -0.1 99.50 perf-profile.children.cycles-pp.main
98.61 -0.0 98.56 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.06 ± 6% +0.0 0.08 ± 14% perf-profile.children.cycles-pp.switch_fpu_return
0.33 ± 3% +0.1 0.38 ± 8% perf-profile.children.cycles-pp.schedule
0.35 ± 2% +0.1 0.41 ± 9% perf-profile.children.cycles-pp.exit_to_user_mode_loop
0.39 ± 4% +0.1 0.46 ± 11% perf-profile.children.cycles-pp.__schedule
54.77 +0.2 54.94 perf-profile.children.cycles-pp.accept_connection
54.77 +0.2 54.94 perf-profile.children.cycles-pp.accept_connections
54.77 +0.2 54.94 perf-profile.children.cycles-pp.process_requests
54.77 +0.2 54.94 perf-profile.children.cycles-pp.spawn_child
54.77 +0.2 54.94 perf-profile.children.cycles-pp.recv_omni
54.63 +0.2 54.80 perf-profile.children.cycles-pp.recv
0.86 ± 3% -0.1 0.74 ± 12% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.12 -0.0 1.09 perf-profile.self.cycles-pp.__free_frozen_pages
0.15 ± 2% +0.0 0.18 ± 12% perf-profile.self.cycles-pp.__rmqueue_pcplist
0.22 ± 11% +0.0 0.27 ± 6% perf-profile.self.cycles-pp.__check_object_size
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki