Message-ID: <8879e26233491e21880de050b3f63bd8c764cd2e.camel@redhat.com>
Date: Wed, 25 Jun 2025 17:06:49 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, kernel test robot
<oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, aubrey.li@...ux.intel.com,
yu.c.chen@...el.com, Andrew Morton <akpm@...ux-foundation.org>, David
Hildenbrand <david@...hat.com>, Ingo Molnar <mingo@...hat.com>, Peter
Zijlstra <peterz@...radead.org>, "Paul E. McKenney" <paulmck@...nel.org>,
Ingo Molnar <mingo@...hat.org>
Subject: Re: [RESEND PATCH v13 2/3] sched: Move task_mm_cid_work to mm
work_struct
On Wed, 2025-06-25 at 09:57 -0400, Mathieu Desnoyers wrote:
> On 2025-06-25 04:01, kernel test robot wrote:
> >
> > Hello,
> >
> > kernel test robot noticed a 10.1% regression of
> > hackbench.throughput on:
>
> Hi Gabriele,
>
> This is a significant regression. Can you investigate before it gets
> merged ?
>
Hi Mathieu,

I'll have a closer look at this next week; until then, let's keep this on hold.
Thanks,
Gabriele
> Thanks,
>
> Mathieu
>
> >
> >
> > commit: f3de761c52148abfb1b4512914f64c7e1c737fc8 ("[RESEND PATCH
> > v13 2/3] sched: Move task_mm_cid_work to mm work_struct")
> > url:
> > https://github.com/intel-lab-lkp/linux/commits/Gabriele-Monaco/sched-Add-prev_sum_exec_runtime-support-for-RT-DL-and-SCX-classes/20250613-171504
> > patch link:
> > https://lore.kernel.org/all/20250613091229.21500-3-gmonaco@redhat.com/
> > patch subject: [RESEND PATCH v13 2/3] sched: Move task_mm_cid_work
> > to mm work_struct
> >
> > testcase: hackbench
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU
> > @ 2.00GHz (Ice Lake) with 256G memory
> > parameters:
> >
> > nr_threads: 100%
> > iterations: 4
> > mode: process
> > ipc: pipe
> > cpufreq_governor: performance
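
(For context: a rough local approximation of the parameters above would be an
invocation along the lines of

    hackbench --pipe --process -g <groups> -l <loops>

where <groups> and <loops> are placeholders; the actual group and loop counts
are derived by the lkp job file from nr_threads and the CPU count, so this is
only a sketch of the workload, not the exact reproducer.)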
> >
> >
> > In addition to that, the commit also has significant impact on the
> > following tests:
> >
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | hackbench: hackbench.throughput 2.9% regression |
> > | test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | ipc=socket |
> > |                  | iterations=4 |
> > |                  | mode=process |
> > |                  | nr_threads=50% |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | aim9: aim9.shell_rtns_3.ops_per_sec 1.7% regression |
> > | test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | test=shell_rtns_3 |
> > |                  | testtime=300s |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | hackbench: hackbench.throughput 6.2% regression |
> > | test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | ipc=pipe |
> > |                  | iterations=4 |
> > |                  | mode=process |
> > |                  | nr_threads=800% |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | aim9: aim9.shell_rtns_1.ops_per_sec 2.1% regression |
> > | test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | test=shell_rtns_1 |
> > |                  | testtime=300s |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | hackbench: hackbench.throughput 11.8% improvement |
> > | test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | ipc=pipe |
> > |                  | iterations=4 |
> > |                  | mode=process |
> > |                  | nr_threads=50% |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | aim9: aim9.shell_rtns_2.ops_per_sec 2.2% regression |
> > | test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | test=shell_rtns_2 |
> > |                  | testtime=300s |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | aim9: aim9.exec_test.ops_per_sec 2.6% regression |
> > | test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | test=exec_test |
> > |                  | testtime=300s |
> > +------------------+--------------------------------------------------------------------------------------------------+
> > | testcase: change | aim7: aim7.jobs-per-min 5.5% regression |
> > | test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
> > | test parameters  | cpufreq_governor=performance |
> > |                  | disk=1BRD_48G |
> > |                  | fs=xfs |
> > |                  | load=600 |
> > |                  | test=sync_disk_rw |
> > +------------------+--------------------------------------------------------------------------------------------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new
> > version of the same patch/commit), kindly add the following tags
> > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202506251555.de6720f7-lkp@intel.com
> >
> >
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20250625/202506251555.de6720f7-lkp@intel.com
> >
> > =========================================================================================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> >   gcc-12/performance/pipe/4/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> > 55140 ± 80% +229.2% 181547 ± 20% numa-
> > meminfo.node1.Mapped
> > 13048 ± 80% +248.2% 45431 ± 20% numa-
> > vmstat.node1.nr_mapped
> > 679.17 ± 22% -25.3% 507.33 ± 10%
> > sched_debug.cfs_rq:/.util_est.max
> > 4.287e+08 ± 3% +20.3% 5.158e+08 cpuidle..time
> > 2953716 ± 13% +228.9% 9716185 ± 2% cpuidle..usage
> > 91072 ± 12% +134.8% 213855 ± 7% meminfo.Mapped
> > 8848637 +10.4% 9769875 ± 5% meminfo.Memused
> > 0.67 ± 4% +0.1 0.78 ± 2% mpstat.cpu.all.irq%
> > 0.03 ± 2% +0.0 0.03 ± 4% mpstat.cpu.all.soft%
> > 4.17 ± 8% +596.0% 29.00 ± 31%
> > mpstat.max_utilization.seconds
> > 2950 -12.3% 2587 vmstat.procs.r
> > 4557607 ± 2% +35.9% 6192548 vmstat.system.cs
> > 397195 ± 5% +73.4% 688726 vmstat.system.in
> > 1490153 -10.1% 1339340 hackbench.throughput
> > 1424170 -8.7% 1299590
> > hackbench.throughput_avg
> > 1490153 -10.1% 1339340
> > hackbench.throughput_best
> > 1353181 ± 2% -10.1% 1216523
> > hackbench.throughput_worst
> > 53158738 ± 3% +34.0% 71240022
> > hackbench.time.involuntary_context_switches
> > 12177 -2.4% 11891
> > hackbench.time.percent_of_cpu_this_job_got
> > 4482 +7.6% 4821
> > hackbench.time.system_time
> > 798.92 +2.0% 815.24
> > hackbench.time.user_time
> > 1.54e+08 ± 3% +46.6% 2.257e+08
> > hackbench.time.voluntary_context_switches
> > 210335 +3.3% 217333 proc-
> > vmstat.nr_anon_pages
> > 23353 ± 14% +136.2% 55152 ± 7% proc-
> > vmstat.nr_mapped
> > 61825 ± 3% +6.6% 65928 ± 2% proc-
> > vmstat.nr_page_table_pages
> > 30859 +4.4% 32213 proc-
> > vmstat.nr_slab_reclaimable
> > 1294 ±177% +1657.1% 22743 ± 66% proc-
> > vmstat.numa_hint_faults
> > 1153 ±198% +1597.0% 19566 ± 79% proc-
> > vmstat.numa_hint_faults_local
> > 1.242e+08 -3.2% 1.202e+08 proc-vmstat.numa_hit
> > 1.241e+08 -3.2% 1.201e+08 proc-
> > vmstat.numa_local
> > 2195 ±110% +2337.0% 53508 ± 55% proc-
> > vmstat.numa_pte_updates
> > 1.243e+08 -3.2% 1.203e+08 proc-
> > vmstat.pgalloc_normal
> > 875909 ± 2% +8.6% 951378 ± 2% proc-vmstat.pgfault
> > 1.231e+08 -3.5% 1.188e+08 proc-vmstat.pgfree
> > 6.903e+10 -5.6% 6.514e+10 perf-stat.i.branch-
> > instructions
> > 0.21 +0.0 0.26 perf-stat.i.branch-
> > miss-rate%
> > 89225177 ± 2% +38.3% 1.234e+08 perf-stat.i.branch-
> > misses
> > 25.64 ± 2% -5.7 19.95 ± 2% perf-stat.i.cache-
> > miss-rate%
> > 9.322e+08 ± 2% +22.8% 1.145e+09 perf-stat.i.cache-
> > references
> > 4553621 ± 2% +39.8% 6363761 perf-stat.i.context-
> > switches
> > 1.12 +4.5% 1.17 perf-stat.i.cpi
> > 186890 ± 2% +143.9% 455784 perf-stat.i.cpu-
> > migrations
> > 2.787e+11 -4.9% 2.649e+11 perf-
> > stat.i.instructions
> > 0.91 -4.4% 0.87 perf-stat.i.ipc
> > 36.79 ± 2% +44.9% 53.30 perf-
> > stat.i.metric.K/sec
> > 0.13 ± 2% +0.1 0.19 perf-
> > stat.overall.branch-miss-rate%
> > 24.44 ± 2% -4.7 19.74 ± 2% perf-
> > stat.overall.cache-miss-rate%
> > 1.12 +4.6% 1.17 perf-
> > stat.overall.cpi
> > 0.89 -4.4% 0.85 perf-
> > stat.overall.ipc
> > 6.755e+10 -5.4% 6.392e+10 perf-stat.ps.branch-
> > instructions
> > 87121352 ± 2% +38.5% 1.206e+08 perf-stat.ps.branch-
> > misses
> > 9.098e+08 ± 2% +23.1% 1.12e+09 perf-stat.ps.cache-
> > references
> > 4443812 ± 2% +39.9% 6218298 perf-
> > stat.ps.context-switches
> > 181595 ± 2% +144.5% 443985 perf-stat.ps.cpu-
> > migrations
> > 2.727e+11 -4.7% 2.599e+11 perf-
> > stat.ps.instructions
> > 1.21e+13 +4.3% 1.262e+13 perf-
> > stat.total.instructions
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.__intel_pmu_enable_all.ctx_resched.event_function.remote_functio
> > n.generic_exec_single
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioc
> > tl.perf_evsel__run_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp._perf_ioctl.perf_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCA
> > LL_64_after_hwframe
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.ctx_resched.event_function.remote_function.generic_exec_single.s
> > mp_call_function_single
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.perf_evsel__r
> > un_ioctl.perf_evsel__enable_cpu
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.entry_SYSCALL_64_after_hwframe.ioctl.perf_evsel__run_ioctl.perf_
> > evsel__enable_cpu.__evlist__enable
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.event_function.remote_function.generic_exec_single.smp_call_func
> > tion_single.event_function_call
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.event_function_call.perf_event_for_each_child._perf_ioctl.perf_i
> > octl.__x64_sys_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.generic_exec_single.smp_call_function_single.event_function_call
> > .perf_event_for_each_child._perf_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.ioctl.perf_evsel__run_ioctl.perf_evsel__enable_cpu.__evlist__ena
> > ble.__cmd_record
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.perf_event_for_each_child._perf_ioctl.perf_ioctl.__x64_sys_ioctl
> > .do_syscall_64
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.perf_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_
> > hwframe.ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.remote_function.generic_exec_single.smp_call_function_single.eve
> > nt_function_call.perf_event_for_each_child
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.calltrace.cycles-
> > pp.smp_call_function_single.event_function_call.perf_event_for_each
> > _child._perf_ioctl.perf_ioctl
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.__cmd_record.cmd_record.perf_c2c__record.run_builtin.handle_inte
> > rnal_command
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.__evlist__enable.__cmd_record.cmd_record.perf_c2c__record.run_bu
> > iltin
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.cmd_record.perf_c2c__record.run_builtin.handle_internal_command.
> > main
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.perf_c2c__record.run_builtin.handle_internal_command.main
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.perf_evsel__enable_cpu.__evlist__enable.__cmd_record.cmd_record.
> > perf_c2c__record
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.calltrace.cycles-
> > pp.perf_evsel__run_ioctl.perf_evsel__enable_cpu.__evlist__enable.__
> > cmd_record.cmd_record
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.__intel_pmu_enable_all
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.__x64_sys_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp._perf_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.ctx_resched
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.event_function
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.generic_exec_single
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.perf_event_for_each_child
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.perf_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.children.cycles-pp.remote_function
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.children.cycles-pp.__evlist__enable
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.children.cycles-pp.perf_c2c__record
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.children.cycles-pp.perf_evsel__enable_cpu
> > 11.84 ± 91% -8.4 3.49 ±154% perf-
> > profile.children.cycles-pp.perf_evsel__run_ioctl
> > 11.84 ± 91% -9.5 2.30 ±141% perf-
> > profile.self.cycles-pp.__intel_pmu_enable_all
> > 23.74 ±185% -98.6% 0.34 ±114% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.folio_alloc_mpol_noprof.shmem_alloc_folio
> > 12.77 ± 80% -83.9% 2.05 ±138% perf-
> > sched.sch_delay.avg.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_
> > exit
> > 5.93 ± 69% -90.5% 0.56 ±105% perf-
> > sched.sch_delay.avg.ms.__cond_resched.generic_perform_write.shmem_f
> > ile_write_iter.vfs_write.ksys_write
> > 6.70 ±152% -94.5% 0.37 ±145% perf-
> > sched.sch_delay.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem
> > _alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> > 0.82 ± 85% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 8.59 ±202% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 13.53 ± 11% -47.0% 7.18 ± 76% perf-
> > sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_6
> > 4
> > 15.63 ± 17% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_fa
> > ult.__do_fault
> > 47.22 ± 77% -85.5% 6.87 ±144% perf-
> > sched.sch_delay.max.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_
> > exit
> > 133.35 ±132% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 68.01 ±203% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 13.53 ± 11% -47.0% 7.18 ± 76% perf-
> > sched.sch_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_6
> > 4
> > 34.59 ± 3% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.filemap_fa
> > ult.__do_fault
> > 40.97 ± 8% -71.8% 11.55 ± 64% perf-
> > sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule
> > _timeout.constprop.0.do_poll
> > 373.07 ±123% -99.8% 0.78 ±156% perf-
> > sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 13.53 ± 11% -62.0% 5.14 ±107% perf-
> > sched.wait_and_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_sysc
> > all_64
> > 120.97 ± 23% -100.0% 0.00 perf-
> > sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.filem
> > ap_fault.__do_fault
> > 46.03 ± 30% -62.5% 17.27 ± 87% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_r
> > eschedule_ipi.[unknown].[unknown]
> > 984.50 ± 14% -43.5% 556.24 ± 58% perf-
> > sched.wait_and_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_
> > from_fork
> > 339.42 ± 12% -97.3% 9.11 ± 54% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 8.00 ± 23% -85.4% 1.17 ±223% perf-
> > sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread
> > .ret_from_fork.ret_from_fork_asm
> > 22.17 ± 49% -100.0% 0.00 perf-
> > sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.filema
> > p_fault.__do_fault
> > 73.83 ± 20% -76.3% 17.50 ± 96% perf-
> > sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_exc_page_
> > fault.[unknown].[unknown]
> > 13.53 ± 11% -62.0% 5.14 ±107% perf-
> > sched.wait_and_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_sysc
> > all_64
> > 336.30 ± 5% -100.0% 0.00 perf-
> > sched.wait_and_delay.max.ms.io_schedule.folio_wait_bit_common.filem
> > ap_fault.__do_fault
> > 23.74 ±185% -98.6% 0.34 ±114% perf-
> > sched.wait_time.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.folio_alloc_mpol_noprof.shmem_alloc_folio
> > 14.48 ± 61% -74.1% 3.76 ±152% perf-
> > sched.wait_time.avg.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_
> > exit
> > 6.48 ± 68% -91.3% 0.56 ±105% perf-
> > sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_f
> > ile_write_iter.vfs_write.ksys_write
> > 6.70 ±152% -94.5% 0.37 ±145% perf-
> > sched.wait_time.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem
> > _alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> > 2.18 ± 75% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 10.79 ±165% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 1.53 ±100% -97.5% 0.04 ± 84% perf-
> > sched.wait_time.avg.ms.__cond_resched.wp_page_copy.__handle_mm_faul
> > t.handle_mm_fault.do_user_addr_fault
> > 105.34 ± 26% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.filemap_fa
> > ult.__do_fault
> > 29.72 ± 40% -76.5% 7.00 ±102% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_faul
> > t.[unknown].[unknown]
> > 32.21 ± 33% -65.7% 11.04 ± 85% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown].[unknown]
> > 984.49 ± 14% -43.5% 556.23 ± 58% perf-
> > sched.wait_time.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_
> > fork
> > 337.00 ± 12% -97.6% 8.11 ± 52% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 53.42 ± 59% -69.8% 16.15 ±162% perf-
> > sched.wait_time.max.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_
> > exit
> > 218.65 ± 83% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 82.52 ±162% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 10.89 ± 98% -98.8% 0.13 ±134% perf-
> > sched.wait_time.max.ms.__cond_resched.wp_page_copy.__handle_mm_faul
> > t.handle_mm_fault.do_user_addr_fault
> > 334.02 ± 6% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.io_schedule.folio_wait_bit_common.filemap_fa
> > ult.__do_fault
> >
> >
> > ***************************************************************************************************
> > lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> > =========================================================================================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> >   gcc-12/performance/socket/4/x86_64-rhel-9.4/process/50%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> > 161258 -12.6% 141018 ± 5% perf-c2c.HITM.total
> > 6514 ± 3% +13.3% 7381 ± 3% uptime.idle
> > 692218 +17.8% 815512 vmstat.system.in
> > 4.747e+08 ± 7% +137.3% 1.127e+09 ± 21% cpuidle..time
> > 5702271 ± 12% +503.6% 34419686 ± 13% cpuidle..usage
> > 141191 ± 2% +10.3% 155768 ± 3% meminfo.PageTables
> > 62180 +26.0% 78348 meminfo.Percpu
> > 2.20 ± 14% +3.5 5.67 ± 20% mpstat.cpu.all.idle%
> > 0.55 +0.2 0.72 ± 5% mpstat.cpu.all.irq%
> > 0.04 ± 2% +0.0 0.06 ± 5% mpstat.cpu.all.soft%
> > 448780 -2.9% 435554 hackbench.throughput
> > 440656 -2.6% 429130
> > hackbench.throughput_avg
> > 448780 -2.9% 435554
> > hackbench.throughput_best
> > 425797 -2.2% 416584
> > hackbench.throughput_worst
> > 90998790 -15.0% 77364427 ± 6%
> > hackbench.time.involuntary_context_switches
> > 12446 -3.9% 11960
> > hackbench.time.percent_of_cpu_this_job_got
> > 16057 -1.4% 15825
> > hackbench.time.system_time
> > 63421 -2.3% 61955 proc-
> > vmstat.nr_kernel_stack
> > 35455 ± 2% +10.0% 38991 ± 3% proc-
> > vmstat.nr_page_table_pages
> > 34542 +5.1% 36312 ± 2% proc-
> > vmstat.nr_slab_reclaimable
> > 151083 ± 16% +46.6% 221509 ± 17% proc-
> > vmstat.numa_hint_faults
> > 113731 ± 26% +64.7% 187314 ± 20% proc-
> > vmstat.numa_hint_faults_local
> > 133591 +3.1% 137709 proc-
> > vmstat.numa_other
> > 53696 ± 16% -28.6% 38362 ± 10% proc-
> > vmstat.numa_pages_migrated
> > 1053504 ± 2% +7.7% 1135052 ± 4% proc-vmstat.pgfault
> > 2077549 ± 3% +8.5% 2254157 ± 4% proc-vmstat.pgfree
> > 53696 ± 16% -28.6% 38362 ± 10% proc-
> > vmstat.pgmigrate_success
> > 4.941e+10 -2.6% 4.81e+10 perf-stat.i.branch-
> > instructions
> > 2.232e+08 -1.9% 2.189e+08 perf-stat.i.branch-
> > misses
> > 2.11e+09 -5.8% 1.989e+09 ± 2% perf-stat.i.cache-
> > references
> > 3.221e+11 -2.5% 3.141e+11 perf-stat.i.cpu-
> > cycles
> > 2.365e+11 -2.7% 2.303e+11 perf-
> > stat.i.instructions
> > 6787 ± 3% +8.0% 7327 ± 4% perf-stat.i.minor-
> > faults
> > 6789 ± 3% +8.0% 7329 ± 4% perf-stat.i.page-
> > faults
> > 4.904e+10 -2.5% 4.779e+10 perf-stat.ps.branch-
> > instructions
> > 2.215e+08 -1.8% 2.174e+08 perf-stat.ps.branch-
> > misses
> > 2.094e+09 -5.7% 1.974e+09 ± 2% perf-stat.ps.cache-
> > references
> > 3.197e+11 -2.4% 3.12e+11 perf-stat.ps.cpu-
> > cycles
> > 2.348e+11 -2.6% 2.288e+11 perf-
> > stat.ps.instructions
> > 6691 ± 3% +7.2% 7174 ± 4% perf-stat.ps.minor-
> > faults
> > 6693 ± 3% +7.2% 7176 ± 4% perf-stat.ps.page-
> > faults
> > 7475567 +16.4% 8699139
> > sched_debug.cfs_rq:/.avg_vruntime.avg
> > 8752154 ± 3% +20.6% 10551563 ± 4%
> > sched_debug.cfs_rq:/.avg_vruntime.max
> > 211424 ± 12% +374.5% 1003211 ± 39%
> > sched_debug.cfs_rq:/.avg_vruntime.stddev
> > 19.44 ± 6% +29.4% 25.17 ± 5%
> > sched_debug.cfs_rq:/.h_nr_queued.max
> > 4.49 ± 4% +33.5% 5.99 ± 4%
> > sched_debug.cfs_rq:/.h_nr_queued.stddev
> > 19.33 ± 6% +29.0% 24.94 ± 5%
> > sched_debug.cfs_rq:/.h_nr_runnable.max
> > 4.47 ± 4% +33.4% 5.96 ± 3%
> > sched_debug.cfs_rq:/.h_nr_runnable.stddev
> > 6446 ±223% +885.4% 63529 ± 57%
> > sched_debug.cfs_rq:/.left_deadline.avg
> > 825119 ±223% +613.5% 5886958 ± 44%
> > sched_debug.cfs_rq:/.left_deadline.max
> > 72645 ±223% +713.6% 591074 ± 49%
> > sched_debug.cfs_rq:/.left_deadline.stddev
> > 6446 ±223% +885.5% 63527 ± 57%
> > sched_debug.cfs_rq:/.left_vruntime.avg
> > 825080 ±223% +613.5% 5886805 ± 44%
> > sched_debug.cfs_rq:/.left_vruntime.max
> > 72642 ±223% +713.7% 591058 ± 49%
> > sched_debug.cfs_rq:/.left_vruntime.stddev
> > 4202 ± 8% +1115.1% 51069 ± 61%
> > sched_debug.cfs_rq:/.load.stddev
> > 367.11 +20.2% 441.44 ± 17%
> > sched_debug.cfs_rq:/.load_avg.max
> > 7475567 +16.4% 8699139
> > sched_debug.cfs_rq:/.min_vruntime.avg
> > 8752154 ± 3% +20.6% 10551563 ± 4%
> > sched_debug.cfs_rq:/.min_vruntime.max
> > 211424 ± 12% +374.5% 1003211 ± 39%
> > sched_debug.cfs_rq:/.min_vruntime.stddev
> > 0.17 ± 16% +39.8% 0.24 ± 6%
> > sched_debug.cfs_rq:/.nr_queued.stddev
> > 6446 ±223% +885.5% 63527 ± 57%
> > sched_debug.cfs_rq:/.right_vruntime.avg
> > 825080 ±223% +613.5% 5886805 ± 44%
> > sched_debug.cfs_rq:/.right_vruntime.max
> > 72642 ±223% +713.7% 591058 ± 49%
> > sched_debug.cfs_rq:/.right_vruntime.stddev
> > 752.39 ± 81% -81.4% 139.72 ± 53%
> > sched_debug.cfs_rq:/.runnable_avg.min
> > 2728 ± 3% +51.2% 4126 ± 8%
> > sched_debug.cfs_rq:/.runnable_avg.stddev
> > 265.50 ± 2% +12.3% 298.07 ± 2%
> > sched_debug.cfs_rq:/.util_avg.stddev
> > 686.78 ± 7% +23.4% 847.76 ± 6%
> > sched_debug.cfs_rq:/.util_est.stddev
> > 19.44 ± 5% +29.7% 25.22 ± 4%
> > sched_debug.cpu.nr_running.max
> > 4.48 ± 5% +34.4% 6.02 ± 3%
> > sched_debug.cpu.nr_running.stddev
> > 67323 ± 14% +130.3% 155017 ± 29%
> > sched_debug.cpu.nr_switches.stddev
> > -20.78 -18.2% -17.00
> > sched_debug.cpu.nr_uninterruptible.min
> > 0.13 ±100% -85.8% 0.02 ±163% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.pte_alloc_one
> > 0.17 ±116% -97.8% 0.00 ±223% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__get_user_pages.get_user_pag
> > es_remote.get_arg_page.copy_strings
> > 22.92 ±110% -97.4% 0.59 ±137% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_nop
> > rof
> > 8.10 ± 45% -78.0% 1.78 ±135% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.allocate_slab.___slab_alloc
> > 3.14 ± 19% -70.9% 0.91 ±102% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_track_caller_n
> > oprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> > 39.05 ±149% -97.4% 1.01 ±223% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vm
> > as.free_pgtables.exit_mmap
> > 15.77 ±203% -99.7% 0.04 ±102% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__tlb_batch_free_encoded_page
> > s.tlb_finish_mmu.exit_mmap.__mmput
> > 1.27 ±177% -98.2% 0.02 ±190% perf-
> > sched.sch_delay.avg.ms.__cond_resched.down_read.acct_collect.do_exi
> > t.do_group_exit
> > 0.20 ±140% -92.4% 0.02 ±201% perf-
> > sched.sch_delay.avg.ms.__cond_resched.down_read.walk_component.link
> > _path_walk.path_openat
> > 86.63 ±221% -99.9% 0.05 ±184% perf-
> > sched.sch_delay.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.d
> > o_syscall_64
> > 0.18 ± 75% -97.0% 0.01 ±141% perf-
> > sched.sch_delay.avg.ms.__cond_resched.dput.step_into.link_path_walk
> > .path_openat
> > 0.13 ± 34% -75.5% 0.03 ±141% perf-
> > sched.sch_delay.avg.ms.__cond_resched.dput.terminate_walk.path_open
> > at.do_filp_open
> > 0.26 ±108% -86.2% 0.04 ±142% perf-
> > sched.sch_delay.avg.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_
> > exit
> > 2.33 ± 11% -65.8% 0.80 ±107% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.
> > __alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> > 0.18 ± 88% -91.1% 0.02 ±194% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc
> > _empty_file.path_openat.do_filp_open
> > 0.50 ±145% -92.5% 0.04 ±210% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getna
> > me_flags.part.0
> > 0.19 ±116% -98.5% 0.00 ±223% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.commit_merge
> > 0.24 ±128% -96.8% 0.01 ±180% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_ar
> > ea_dup.__split_vma.vms_gather_munmap_vmas
> > 0.99 ± 16% -58.0% 0.42 ±100% perf-
> > sched.sch_delay.avg.ms.__cond_resched.mutex_lock.unix_stream_read_g
> > eneric.unix_stream_recvmsg.sock_recvmsg
> > 0.27 ±124% -97.5% 0.01 ±141% perf-
> > sched.sch_delay.avg.ms.__cond_resched.remove_vma.exit_mmap.__mmput.
> > exit_mm
> > 1.08 ± 28% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.96 ± 93% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 0.53 ±182% -94.2% 0.03 ±158% perf-
> > sched.sch_delay.avg.ms.__cond_resched.wp_page_copy.__handle_mm_faul
> > t.handle_mm_fault.do_user_addr_fault
> > 0.84 ±160% -93.5% 0.05 ±100% perf-
> > sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.i
> > sra.0
> > 29.39 ±172% -94.0% 1.78 ±123% perf-
> > sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_6
> > 4
> > 21.51 ± 60% -74.7% 5.45 ±118% perf-
> > sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYS
> > CALL_64_after_hwframe
> > 13.77 ± 61% -81.3% 2.57 ±113% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown]
> > 11.22 ± 33% -74.5% 2.86 ±107% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 1.99 ± 90% -90.1% 0.20 ±100% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown].[unknown]
> > 4.50 ±138% -94.9% 0.23 ±200% perf-
> > sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_ep
> > oll_wait.__x64_sys_epoll_wait
> > 27.91 ±218% -99.6% 0.11 ±120% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 9.91 ± 51% -68.3% 3.15 ±124% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 10.18 ± 24% -62.4% 3.83 ±105% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.sock_alloc_send_pskb.unix_s
> > tream_sendmsg.sock_write_iter
> > 1.16 ± 20% -62.7% 0.43 ±106% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.unix_stream_read_generic.un
> > ix_stream_recvmsg.sock_recvmsg
> > 0.27 ± 99% -92.0% 0.02 ±172% perf-
> > sched.sch_delay.max.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.pte_alloc_one
> > 0.32 ±128% -98.9% 0.00 ±223% perf-
> > sched.sch_delay.max.ms.__cond_resched.__get_user_pages.get_user_pag
> > es_remote.get_arg_page.copy_strings
> > 0.88 ± 94% -86.7% 0.12 ±144% perf-
> > sched.sch_delay.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_e
> > vent_mmap_event.perf_event_mmap.__mmap_region
> > 252.53 ±128% -98.4% 4.12 ±138% perf-
> > sched.sch_delay.max.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_nop
> > rof
> > 60.22 ± 58% -67.8% 19.37 ±146% perf-
> > sched.sch_delay.max.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.allocate_slab.___slab_alloc
> > 168.93 ±209% -99.9% 0.15 ±100% perf-
> > sched.sch_delay.max.ms.__cond_resched.__tlb_batch_free_encoded_page
> > s.tlb_finish_mmu.exit_mmap.__mmput
> > 3.79 ±169% -98.6% 0.05 ±199% perf-
> > sched.sch_delay.max.ms.__cond_resched.down_read.acct_collect.do_exi
> > t.do_group_exit
> > 517.19 ±222% -99.9% 0.29 ±201% perf-
> > sched.sch_delay.max.ms.__cond_resched.dput.__fput.__x64_sys_close.d
> > o_syscall_64
> > 0.54 ± 82% -98.4% 0.01 ±141% perf-
> > sched.sch_delay.max.ms.__cond_resched.dput.step_into.link_path_walk
> > .path_openat
> > 0.34 ± 57% -93.1% 0.02 ±203% perf-
> > sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc
> > _empty_file.path_openat.do_filp_open
> > 0.64 ±141% -99.4% 0.00 ±223% perf-
> > sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.commit_merge
> > 0.28 ±111% -97.2% 0.01 ±180% perf-
> > sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.vm_ar
> > ea_dup.__split_vma.vms_gather_munmap_vmas
> > 0.29 ±114% -97.6% 0.01 ±141% perf-
> > sched.sch_delay.max.ms.__cond_resched.remove_vma.exit_mmap.__mmput.
> > exit_mm
> > 133.30 ± 46% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 12.53 ±135% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 1.11 ± 85% -76.9% 0.26 ±202% perf-
> > sched.sch_delay.max.ms.__cond_resched.unmap_vmas.vms_clear_ptes.par
> > t.0
> > 7.48 ±214% -99.0% 0.08 ±141% perf-
> > sched.sch_delay.max.ms.__cond_resched.wp_page_copy.__handle_mm_faul
> > t.handle_mm_fault.do_user_addr_fault
> > 28.59 ±191% -99.0% 0.28 ±120% perf-
> > sched.sch_delay.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.i
> > sra.0
> > 285.16 ±145% -99.3% 1.94 ±111% perf-
> > sched.sch_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_6
> > 4
> > 143.71 ±128% -91.0% 12.97 ±134% perf-
> > sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown].[unknown]
> > 107.10 ±162% -99.1% 0.95 ±190% perf-
> > sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_ep
> > oll_wait.__x64_sys_epoll_wait
> > 352.73 ±216% -99.4% 2.06 ±118% perf-
> > sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 1169 ± 25% -58.7% 482.79 ±101% perf-
> > sched.sch_delay.max.ms.schedule_timeout.unix_stream_read_generic.un
> > ix_stream_recvmsg.sock_recvmsg
> > 1.80 ± 20% -58.5% 0.75 ±105% perf-
> > sched.total_sch_delay.average.ms
> > 5.09 ± 20% -58.0% 2.14 ±106% perf-
> > sched.total_wait_and_delay.average.ms
> > 20.86 ± 25% -82.0% 3.76 ±147% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.al
> > loc_slab_obj_exts.allocate_slab.___slab_alloc
> > 8.10 ± 21% -69.1% 2.51 ±103% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__kmalloc_node_track_cal
> > ler_noprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> > 22.82 ± 27% -66.9% 7.55 ±103% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine
> > _move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> > 6.55 ± 13% -64.1% 2.35 ±108% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_no
> > prof.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> > 139.95 ± 55% -64.0% 50.45 ±122% perf-
> > sched.wait_and_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.
> > ksys_read
> > 27.54 ± 61% -81.3% 5.15 ±113% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_a
> > pic_timer_interrupt.[unknown]
> > 27.75 ± 30% -73.3% 7.42 ±106% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_a
> > pic_timer_interrupt.[unknown].[unknown]
> > 26.76 ± 25% -64.2% 9.57 ±107% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_r
> > eschedule_ipi.[unknown].[unknown]
> > 29.39 ± 34% -67.3% 9.61 ±115% perf-
> > sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp
> > _kthread.kthread
> > 27.53 ± 25% -62.9% 10.21 ±105% perf-
> > sched.wait_and_delay.avg.ms.schedule_timeout.sock_alloc_send_pskb.u
> > nix_stream_sendmsg.sock_write_iter
> > 3.25 ± 20% -62.2% 1.23 ±106% perf-
> > sched.wait_and_delay.avg.ms.schedule_timeout.unix_stream_read_gener
> > ic.unix_stream_recvmsg.sock_recvmsg
> > 864.18 ± 4% -99.3% 6.27 ±103% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 141.47 ± 38% -72.9% 38.27 ±154% perf-
> > sched.wait_and_delay.max.ms.__cond_resched.__kmalloc_node_noprof.al
> > loc_slab_obj_exts.allocate_slab.___slab_alloc
> > 2346 ± 25% -58.7% 969.53 ±101% perf-
> > sched.wait_and_delay.max.ms.schedule_timeout.unix_stream_read_gener
> > ic.unix_stream_recvmsg.sock_recvmsg
> > 83.99 ±223% -100.0% 0.02 ±163% perf-
> > sched.wait_time.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.pte_alloc_one
> > 0.16 ±122% -97.7% 0.00 ±223% perf-
> > sched.wait_time.avg.ms.__cond_resched.__get_user_pages.get_user_pag
> > es_remote.get_arg_page.copy_strings
> > 12.76 ± 37% -81.6% 2.35 ±125% perf-
> > sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.allocate_slab.___slab_alloc
> > 4.96 ± 22% -67.9% 1.59 ±104% perf-
> > sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_track_caller_n
> > oprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> > 75.22 ± 91% -96.4% 2.67 ±223% perf-
> > sched.wait_time.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vm
> > as.free_pgtables.exit_mmap
> > 23.31 ±188% -98.8% 0.28 ±195% perf-
> > sched.wait_time.avg.ms.__cond_resched.__tlb_batch_free_encoded_page
> > s.tlb_finish_mmu.exit_mmap.__mmput
> > 14.93 ± 22% -68.0% 4.78 ±104% perf-
> > sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move
> > _task.__set_cpus_allowed_ptr.__sched_setaffinity
> > 1.29 ±178% -98.5% 0.02 ±185% perf-
> > sched.wait_time.avg.ms.__cond_resched.down_read.acct_collect.do_exi
> > t.do_group_exit
> > 0.20 ±140% -92.5% 0.02 ±200% perf-
> > sched.wait_time.avg.ms.__cond_resched.down_read.walk_component.link
> > _path_walk.path_openat
> > 87.29 ±221% -99.9% 0.05 ±184% perf-
> > sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.d
> > o_syscall_64
> > 0.18 ± 76% -97.0% 0.01 ±141% perf-
> > sched.wait_time.avg.ms.__cond_resched.dput.step_into.link_path_walk
> > .path_openat
> > 0.12 ± 33% -87.4% 0.02 ±212% perf-
> > sched.wait_time.avg.ms.__cond_resched.dput.terminate_walk.path_open
> > at.do_filp_open
> > 4.22 ± 15% -63.3% 1.55 ±108% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.
> > __alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> > 0.18 ± 88% -91.1% 0.02 ±194% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc
> > _empty_file.path_openat.do_filp_open
> > 0.50 ±145% -92.5% 0.04 ±210% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getna
> > me_flags.part.0
> > 0.19 ±116% -98.5% 0.00 ±223% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.commit_merge
> > 0.24 ±128% -96.8% 0.01 ±180% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_ar
> > ea_dup.__split_vma.vms_gather_munmap_vmas
> > 1.79 ± 27% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.98 ± 92% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 2.44 ±199% -98.1% 0.05 ±109% perf-
> > sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.i
> > sra.0
> > 125.16 ± 52% -64.6% 44.36 ±120% perf-
> > sched.wait_time.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_
> > read
> > 13.77 ± 61% -81.3% 2.58 ±113% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown]
> > 16.53 ± 29% -72.5% 4.55 ±106% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 3.11 ± 80% -80.7% 0.60 ±138% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown].[unknown]
> > 17.30 ± 23% -65.0% 6.05 ±107% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown].[unknown]
> > 50.76 ±143% -98.1% 0.97 ±101% perf-
> > sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 19.48 ± 27% -66.8% 6.46 ±111% perf-
> > sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 17.35 ± 25% -63.3% 6.37 ±106% perf-
> > sched.wait_time.avg.ms.schedule_timeout.sock_alloc_send_pskb.unix_s
> > tream_sendmsg.sock_write_iter
> > 2.09 ± 21% -62.0% 0.79 ±107% perf-
> > sched.wait_time.avg.ms.schedule_timeout.unix_stream_read_generic.un
> > ix_stream_recvmsg.sock_recvmsg
> > 850.73 ± 6% -99.3% 5.76 ±102% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 168.00 ±223% -100.0% 0.02 ±172% perf-
> > sched.wait_time.max.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.pte_alloc_one
> > 0.32 ±131% -98.8% 0.00 ±223% perf-
> > sched.wait_time.max.ms.__cond_resched.__get_user_pages.get_user_pag
> > es_remote.get_arg_page.copy_strings
> > 0.88 ± 94% -86.7% 0.12 ±144% perf-
> > sched.wait_time.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_e
> > vent_mmap_event.perf_event_mmap.__mmap_region
> > 83.05 ± 45% -75.0% 20.78 ±142% perf-
> > sched.wait_time.max.ms.__cond_resched.__kmalloc_node_noprof.alloc_s
> > lab_obj_exts.allocate_slab.___slab_alloc
> > 393.39 ± 76% -96.3% 14.60 ±223% perf-
> > sched.wait_time.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vm
> > as.free_pgtables.exit_mmap
> > 3.87 ±170% -98.6% 0.05 ±199% perf-
> > sched.wait_time.max.ms.__cond_resched.down_read.acct_collect.do_exi
> > t.do_group_exit
> > 520.88 ±222% -99.9% 0.29 ±201% perf-
> > sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.d
> > o_syscall_64
> > 0.54 ± 82% -98.4% 0.01 ±141% perf-
> > sched.wait_time.max.ms.__cond_resched.dput.step_into.link_path_walk
> > .path_openat
> > 0.34 ± 57% -93.1% 0.02 ±203% perf-
> > sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc
> > _empty_file.path_openat.do_filp_open
> > 0.64 ±141% -99.4% 0.00 ±223% perf-
> > sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.commit_merge
> > 0.28 ±111% -97.2% 0.01 ±180% perf-
> > sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.vm_ar
> > ea_dup.__split_vma.vms_gather_munmap_vmas
> > 210.15 ± 42% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 34.48 ±131% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 1.11 ± 85% -76.9% 0.26 ±202% perf-
> > sched.wait_time.max.ms.__cond_resched.unmap_vmas.vms_clear_ptes.par
> > t.0
> > 92.32 ±212% -99.7% 0.27 ±123% perf-
> > sched.wait_time.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.i
> > sra.0
> > 3252 ± 21% -58.5% 1351 ±103% perf-
> > sched.wait_time.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_
> > read
> > 1602 ± 28% -66.2% 541.12 ±100% perf-
> > sched.wait_time.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 530.17 ± 95% -98.5% 7.79 ±119% perf-
> > sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 1177 ± 25% -58.7% 486.74 ±101% perf-
> > sched.wait_time.max.ms.schedule_timeout.unix_stream_read_generic.un
> > ix_stream_recvmsg.sock_recvmsg
> > 50.88 -1.4 49.53 perf-
> > profile.calltrace.cycles-pp.read
> > 45.95 -1.0 44.92 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> > 45.66 -1.0 44.64 perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 3.44 ± 4% -0.8 2.66 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.__wake_up_common.__wake_up_sync_key.sock_def_readable.unix_strea
> > m_sendmsg.sock_write_iter
> > 3.32 ± 4% -0.8 2.56 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.soc
> > k_def_readable.unix_stream_sendmsg
> > 3.28 ± 4% -0.8 2.52 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_
> > up_sync_key.sock_def_readable
> > 3.48 ± 3% -0.6 2.83 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.schedule.schedule_timeout.unix_stream_read_generic.unix_stream_r
> > ecvmsg.sock_recvmsg
> > 3.52 ± 3% -0.6 2.87 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.schedule_timeout.unix_stream_read_generic.unix_stream_recvmsg.so
> > ck_recvmsg.sock_read_iter
> > 3.45 ± 3% -0.6 2.80 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.__schedule.schedule.schedule_timeout.unix_stream_read_generic.un
> > ix_stream_recvmsg
> > 47.06 -0.6 46.45 perf-
> > profile.calltrace.cycles-pp.write
> > 4.26 ± 5% -0.6 3.69 perf-
> > profile.calltrace.cycles-
> > pp.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg.sock_wr
> > ite_iter.vfs_write
> > 1.58 ± 3% -0.6 1.02 ± 8% perf-
> > profile.calltrace.cycles-
> > pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_
> > up_common.__wake_up_sync_key
> > 1.31 ± 3% -0.5 0.85 ± 8% perf-
> > profile.calltrace.cycles-
> > pp.enqueue_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_fun
> > ction.__wake_up_common
> > 1.25 ± 3% -0.4 0.81 ± 8% perf-
> > profile.calltrace.cycles-
> > pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.try_to_wake_up.a
> > utoremove_wake_function
> > 0.84 ± 3% -0.2 0.60 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwfr
> > ame.read
> > 7.91 -0.2 7.68 perf-
> > profile.calltrace.cycles-
> > pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recv
> > msg.sock_recvmsg.sock_read_iter
> > 3.17 ± 2% -0.2 2.94 perf-
> > profile.calltrace.cycles-
> > pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.
> > vfs_write.ksys_write
> > 7.80 -0.2 7.58 perf-
> > profile.calltrace.cycles-
> > pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_g
> > eneric.unix_stream_recvmsg.sock_recvmsg
> > 7.58 -0.2 7.36 perf-
> > profile.calltrace.cycles-
> > pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_acto
> > r.unix_stream_read_generic.unix_stream_recvmsg
> > 1.22 ± 4% -0.2 1.02 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.try_to_block_task.__schedule.schedule.schedule_timeout.unix_stre
> > am_read_generic
> > 1.18 ± 4% -0.2 0.99 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.dequeue_task_fair.try_to_block_task.__schedule.schedule.schedule
> > _timeout
> > 0.87 -0.2 0.68 ± 8% perf-
> > profile.calltrace.cycles-
> > pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.schedul
> > e_timeout
> > 1.14 ± 4% -0.2 0.95 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule.
> > schedule
> > 0.90 -0.2 0.72 ± 7% perf-
> > profile.calltrace.cycles-
> > pp.__pick_next_task.__schedule.schedule.schedule_timeout.unix_strea
> > m_read_generic
> > 3.45 ± 3% -0.1 3.30 perf-
> > profile.calltrace.cycles-
> > pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_st
> > ream_read_actor.unix_stream_read_generic
> > 1.96 -0.1 1.82 perf-
> > profile.calltrace.cycles-pp.clear_bhb_loop.read
> > 1.97 -0.1 1.86 perf-
> > profile.calltrace.cycles-pp.clear_bhb_loop.write
> > 2.35 -0.1 2.25 perf-
> > profile.calltrace.cycles-
> > pp.__memcg_slab_post_alloc_hook.__kmalloc_node_track_caller_noprof.
> > kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> > 2.58 -0.1 2.48 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64.read
> > 1.38 ± 4% -0.1 1.28 ± 2% perf-
> > profile.calltrace.cycles-
> > pp._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.
> > sock_write_iter.vfs_write
> > 1.35 -0.1 1.25 perf-
> > profile.calltrace.cycles-
> > pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_send
> > msg.sock_write_iter.vfs_write
> > 0.67 ± 7% -0.1 0.58 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.dequeue_entity.dequeue_entities.dequeue_task_fair.try_to_block_t
> > ask.__schedule
> > 2.59 -0.1 2.50 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64.write
> > 2.02 -0.1 1.96 perf-
> > profile.calltrace.cycles-
> > pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_noprof.__allo
> > c_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> > 0.77 ± 3% -0.0 0.72 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwfram
> > e.write
> > 0.65 ± 4% -0.0 0.60 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > .read
> > 0.74 -0.0 0.70 perf-
> > profile.calltrace.cycles-
> > pp.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_rec
> > vmsg.sock_read_iter
> > 1.04 -0.0 0.99 perf-
> > profile.calltrace.cycles-
> > pp.obj_cgroup_charge_account.__memcg_slab_post_alloc_hook.__kmalloc
> > _node_track_caller_noprof.kmalloc_reserve.__alloc_skb
> > 0.69 -0.0 0.65 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.check_heap_object.__check_object_size.skb_copy_datagram_from_ite
> > r.unix_stream_sendmsg.sock_write_iter
> > 0.82 -0.0 0.80 perf-
> > profile.calltrace.cycles-
> > pp.obj_cgroup_charge_account.__memcg_slab_post_alloc_hook.kmem_cach
> > e_alloc_node_noprof.__alloc_skb.alloc_skb_with_frags
> > 0.57 -0.0 0.56 perf-
> > profile.calltrace.cycles-
> > pp.refill_obj_stock.__memcg_slab_free_hook.kmem_cache_free.unix_str
> > eam_read_generic.unix_stream_recvmsg
> > 0.80 ± 9% +0.2 1.01 ± 8% perf-
> > profile.calltrace.cycles-
> > pp._raw_spin_lock_irqsave.__wake_up_sync_key.sock_def_readable.unix
> > _stream_sendmsg.sock_write_iter
> > 2.50 ± 4% +0.3 2.82 ± 9% perf-
> > profile.calltrace.cycles-
> > pp.___slab_alloc.kmem_cache_alloc_node_noprof.__alloc_skb.alloc_skb
> > _with_frags.sock_alloc_send_pskb
> > 2.64 ± 6% +0.4 3.06 ± 12% perf-
> > profile.calltrace.cycles-
> > pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__put_pa
> > rtials.kmem_cache_free.unix_stream_read_generic
> > 2.73 ± 6% +0.4 3.16 ± 12% perf-
> > profile.calltrace.cycles-
> > pp._raw_spin_lock_irqsave.__put_partials.kmem_cache_free.unix_strea
> > m_read_generic.unix_stream_recvmsg
> > 2.87 ± 6% +0.4 3.30 ± 12% perf-
> > profile.calltrace.cycles-
> > pp.__put_partials.kmem_cache_free.unix_stream_read_generic.unix_str
> > eam_recvmsg.sock_recvmsg
> > 18.38 +0.6 18.93 perf-
> > profile.calltrace.cycles-
> > pp.sock_alloc_send_pskb.unix_stream_sendmsg.sock_write_iter.vfs_wri
> > te.ksys_write
> > 0.00 +0.7 0.70 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.pv_native_safe_halt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_
> > enter.cpuidle_enter_state
> > 0.00 +0.8 0.76 ± 16% perf-
> > profile.calltrace.cycles-
> > pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_stream_send
> > msg.sock_write_iter.vfs_write
> > 0.00 +1.5 1.46 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_
> > state.cpuidle_enter
> > 0.00 +1.5 1.46 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_e
> > nter.cpuidle_idle_call
> > 0.00 +1.5 1.46 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_c
> > all.do_idle
> > 0.00 +1.5 1.50 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_
> > startup_entry
> > 0.00 +1.5 1.52 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_
> > secondary
> > 0.00 +1.6 1.61 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.comm
> > on_startup_64
> > 0.18 ±141% +1.8 1.93 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
> > 0.18 ±141% +1.8 1.94 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.cpu_startup_entry.start_secondary.common_startup_64
> > 0.18 ±141% +1.8 1.94 ± 11% perf-
> > profile.calltrace.cycles-pp.start_secondary.common_startup_64
> > 0.18 ±141% +1.8 1.97 ± 11% perf-
> > profile.calltrace.cycles-pp.common_startup_64
> > 0.00 +2.0 1.96 ± 11% perf-
> > profile.calltrace.cycles-
> > pp.asm_sysvec_call_function_single.pv_native_safe_halt.acpi_safe_ha
> > lt.acpi_idle_do_entry.acpi_idle_enter
> > 87.96 -1.4 86.57 perf-
> > profile.children.cycles-pp.do_syscall_64
> > 88.72 -1.4 87.33 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 51.44 -1.4 50.05 perf-
> > profile.children.cycles-pp.read
> > 4.55 ± 2% -0.8 3.74 ± 5% perf-
> > profile.children.cycles-pp.schedule
> > 3.76 ± 4% -0.7 3.02 ± 3% perf-
> > profile.children.cycles-pp.__wake_up_common
> > 3.64 ± 4% -0.7 2.92 ± 3% perf-
> > profile.children.cycles-pp.autoremove_wake_function
> > 3.60 ± 4% -0.7 2.90 ± 3% perf-
> > profile.children.cycles-pp.try_to_wake_up
> > 4.00 ± 2% -0.6 3.36 ± 4% perf-
> > profile.children.cycles-pp.schedule_timeout
> > 4.65 ± 2% -0.6 4.02 ± 4% perf-
> > profile.children.cycles-pp.__schedule
> > 47.64 -0.6 47.01 perf-
> > profile.children.cycles-pp.write
> > 4.58 ± 4% -0.5 4.06 perf-
> > profile.children.cycles-pp.__wake_up_sync_key
> > 1.45 ± 2% -0.4 1.00 ± 5% perf-
> > profile.children.cycles-pp.exit_to_user_mode_loop
> > 1.84 ± 3% -0.3 1.50 ± 3% perf-
> > profile.children.cycles-pp.ttwu_do_activate
> > 1.62 ± 2% -0.3 1.33 ± 3% perf-
> > profile.children.cycles-pp.enqueue_task
> > 1.53 ± 2% -0.3 1.26 ± 3% perf-
> > profile.children.cycles-pp.enqueue_task_fair
> > 1.40 -0.3 1.14 ± 6% perf-
> > profile.children.cycles-pp.pick_next_task_fair
> > 3.97 -0.2 3.73 perf-
> > profile.children.cycles-pp.clear_bhb_loop
> > 1.43 -0.2 1.19 ± 5% perf-
> > profile.children.cycles-pp.__pick_next_task
> > 0.75 ± 4% -0.2 0.52 ± 8% perf-
> > profile.children.cycles-pp.raw_spin_rq_lock_nested
> > 7.95 -0.2 7.72 perf-
> > profile.children.cycles-pp.unix_stream_read_actor
> > 7.84 -0.2 7.61 perf-
> > profile.children.cycles-pp.skb_copy_datagram_iter
> > 3.24 ± 2% -0.2 3.01 perf-
> > profile.children.cycles-pp.skb_copy_datagram_from_iter
> > 7.63 -0.2 7.42 perf-
> > profile.children.cycles-pp.__skb_datagram_iter
> > 0.94 ± 4% -0.2 0.73 ± 4% perf-
> > profile.children.cycles-pp.enqueue_entity
> > 0.95 ± 8% -0.2 0.76 ± 4% perf-
> > profile.children.cycles-pp.update_curr
> > 1.37 ± 3% -0.2 1.18 ± 3% perf-
> > profile.children.cycles-pp.dequeue_task_fair
> > 1.34 ± 4% -0.2 1.16 ± 3% perf-
> > profile.children.cycles-pp.try_to_block_task
> > 4.50 -0.2 4.34 perf-
> > profile.children.cycles-pp.__memcg_slab_post_alloc_hook
> > 1.37 ± 3% -0.2 1.20 ± 3% perf-
> > profile.children.cycles-pp.dequeue_entities
> > 3.48 ± 3% -0.1 3.33 perf-
> > profile.children.cycles-pp._copy_to_iter
> > 0.91 -0.1 0.78 ± 3% perf-
> > profile.children.cycles-pp.update_load_avg
> > 4.85 -0.1 4.72 perf-
> > profile.children.cycles-pp.__check_object_size
> > 3.23 -0.1 3.11 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64
> > 0.54 ± 3% -0.1 0.42 ± 5% perf-
> > profile.children.cycles-pp.switch_mm_irqs_off
> > 1.40 ± 4% -0.1 1.30 ± 2% perf-
> > profile.children.cycles-pp._copy_from_iter
> > 2.02 -0.1 1.92 perf-
> > profile.children.cycles-pp.its_return_thunk
> > 0.43 ± 2% -0.1 0.32 ± 3% perf-
> > profile.children.cycles-pp.switch_fpu_return
> > 0.29 ± 2% -0.1 0.18 ± 6% perf-
> > profile.children.cycles-pp.__enqueue_entity
> > 1.46 ± 3% -0.1 1.36 ± 2% perf-
> > profile.children.cycles-pp.fdget_pos
> > 0.44 ± 3% -0.1 0.34 ± 5% perf-
> > profile.children.cycles-pp.set_next_entity
> > 0.42 ± 2% -0.1 0.32 ± 4% perf-
> > profile.children.cycles-pp.pick_task_fair
> > 0.31 ± 2% -0.1 0.24 ± 6% perf-
> > profile.children.cycles-pp.reweight_entity
> > 0.28 ± 2% -0.1 0.20 ± 7% perf-
> > profile.children.cycles-pp.__dequeue_entity
> > 1.96 -0.1 1.88 perf-
> > profile.children.cycles-pp.obj_cgroup_charge_account
> > 0.28 ± 2% -0.1 0.21 ± 3% perf-
> > profile.children.cycles-pp.update_cfs_group
> > 0.23 ± 2% -0.1 0.16 ± 5% perf-
> > profile.children.cycles-pp.pick_eevdf
> > 0.26 ± 2% -0.1 0.19 ± 4% perf-
> > profile.children.cycles-pp.wakeup_preempt
> > 1.46 -0.1 1.40 perf-
> > profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.48 ± 2% -0.1 0.42 ± 5% perf-
> > profile.children.cycles-pp.__rseq_handle_notify_resume
> > 0.30 -0.1 0.24 ± 4% perf-
> > profile.children.cycles-pp.restore_fpregs_from_fpstate
> > 0.82 -0.1 0.77 perf-
> > profile.children.cycles-pp.__cond_resched
> > 0.27 ± 2% -0.0 0.22 ± 4% perf-
> > profile.children.cycles-pp.__update_load_avg_se
> > 0.14 ± 3% -0.0 0.10 ± 7% perf-
> > profile.children.cycles-pp.update_curr_se
> > 0.79 -0.0 0.74 perf-
> > profile.children.cycles-pp.mutex_lock
> > 0.34 ± 3% -0.0 0.30 ± 5% perf-
> > profile.children.cycles-pp.rseq_ip_fixup
> > 0.15 ± 4% -0.0 0.11 ± 5% perf-
> > profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> > 0.21 ± 3% -0.0 0.16 ± 4% perf-
> > profile.children.cycles-pp.__switch_to
> > 0.17 ± 4% -0.0 0.13 ± 7% perf-
> > profile.children.cycles-pp.place_entity
> > 0.22 -0.0 0.18 ± 2% perf-
> > profile.children.cycles-pp.wake_affine
> > 0.24 -0.0 0.20 ± 2% perf-
> > profile.children.cycles-pp.check_stack_object
> > 0.64 ± 2% -0.0 0.61 ± 3% perf-
> > profile.children.cycles-pp.__virt_addr_valid
> > 0.38 ± 2% -0.0 0.34 ± 2% perf-
> > profile.children.cycles-pp.tick_nohz_handler
> > 0.18 ± 3% -0.0 0.14 ± 6% perf-
> > profile.children.cycles-pp.update_rq_clock
> > 0.66 -0.0 0.62 perf-
> > profile.children.cycles-pp.rw_verify_area
> > 0.19 -0.0 0.16 ± 4% perf-
> > profile.children.cycles-pp.task_mm_cid_work
> > 0.34 ± 3% -0.0 0.31 ± 2% perf-
> > profile.children.cycles-pp.update_process_times
> > 0.12 ± 8% -0.0 0.08 ± 11% perf-
> > profile.children.cycles-pp.detach_tasks
> > 0.39 ± 3% -0.0 0.36 ± 2% perf-
> > profile.children.cycles-pp.__hrtimer_run_queues
> > 0.21 ± 3% -0.0 0.18 ± 6% perf-
> > profile.children.cycles-pp.__update_load_avg_cfs_rq
> > 0.18 ± 6% -0.0 0.15 ± 4% perf-
> > profile.children.cycles-pp.task_tick_fair
> > 0.25 ± 3% -0.0 0.22 ± 4% perf-
> > profile.children.cycles-pp.rseq_get_rseq_cs
> > 0.23 ± 5% -0.0 0.20 ± 3% perf-
> > profile.children.cycles-pp.sched_tick
> > 0.14 ± 3% -0.0 0.11 ± 6% perf-
> > profile.children.cycles-pp.check_preempt_wakeup_fair
> > 0.11 ± 4% -0.0 0.08 ± 7% perf-
> > profile.children.cycles-pp.update_min_vruntime
> > 0.06 -0.0 0.03 ± 70% perf-
> > profile.children.cycles-pp.update_curr_dl_se
> > 0.14 ± 3% -0.0 0.12 ± 5% perf-
> > profile.children.cycles-pp.put_prev_entity
> > 0.13 ± 5% -0.0 0.10 ± 3% perf-
> > profile.children.cycles-pp.task_h_load
> > 0.68 -0.0 0.65 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> > 0.46 ± 2% -0.0 0.43 ± 2% perf-
> > profile.children.cycles-pp.hrtimer_interrupt
> > 0.52 -0.0 0.50 perf-
> > profile.children.cycles-pp.scm_recv_unix
> > 0.08 ± 4% -0.0 0.06 ± 9% perf-
> > profile.children.cycles-pp.__cgroup_account_cputime
> > 0.11 ± 5% -0.0 0.09 ± 4% perf-
> > profile.children.cycles-pp.__switch_to_asm
> > 0.46 ± 2% -0.0 0.44 ± 2% perf-
> > profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> > 0.08 ± 8% -0.0 0.06 ± 9% perf-
> > profile.children.cycles-pp.activate_task
> > 0.08 ± 8% -0.0 0.06 ± 9% perf-
> > profile.children.cycles-pp.detach_task
> > 0.11 ± 5% -0.0 0.09 ± 7% perf-
> > profile.children.cycles-pp.os_xsave
> > 0.13 ± 5% -0.0 0.11 ± 6% perf-
> > profile.children.cycles-pp.avg_vruntime
> > 0.13 ± 4% -0.0 0.11 ± 5% perf-
> > profile.children.cycles-pp.update_entity_lag
> > 0.08 ± 4% -0.0 0.06 ± 7% perf-
> > profile.children.cycles-pp.__calc_delta
> > 0.09 ± 5% -0.0 0.07 ± 8% perf-
> > profile.children.cycles-pp.vruntime_eligible
> > 0.34 ± 2% -0.0 0.32 perf-
> > profile.children.cycles-pp._raw_spin_unlock_irqrestore
> > 0.30 -0.0 0.29 ± 2% perf-
> > profile.children.cycles-pp.__build_skb_around
> > 0.08 ± 5% -0.0 0.07 ± 6% perf-
> > profile.children.cycles-pp.rseq_update_cpu_node_id
> > 0.15 -0.0 0.14 perf-
> > profile.children.cycles-pp.security_socket_getpeersec_dgram
> > 0.07 ± 5% +0.0 0.09 ± 5% perf-
> > profile.children.cycles-pp.native_irq_return_iret
> > 0.38 ± 2% +0.0 0.40 ± 2% perf-
> > profile.children.cycles-pp.mod_memcg_lruvec_state
> > 0.27 ± 2% +0.0 0.30 ± 2% perf-
> > profile.children.cycles-pp.prepare_task_switch
> > 0.05 ± 7% +0.0 0.08 ± 8% perf-
> > profile.children.cycles-pp.handle_softirqs
> > 0.06 +0.0 0.09 ± 11% perf-
> > profile.children.cycles-pp.finish_wait
> > 0.06 ± 7% +0.0 0.11 ± 6% perf-
> > profile.children.cycles-pp.__irq_exit_rcu
> > 0.06 ± 8% +0.1 0.11 ± 8% perf-
> > profile.children.cycles-pp.ttwu_queue_wakelist
> > 0.01 ±223% +0.1 0.07 ± 10% perf-
> > profile.children.cycles-pp.ktime_get
> > 0.54 ± 4% +0.1 0.61 perf-
> > profile.children.cycles-pp.select_task_rq
> > 0.00 +0.1 0.07 ± 10% perf-
> > profile.children.cycles-pp.enqueue_dl_entity
> > 0.12 ± 4% +0.1 0.19 ± 7% perf-
> > profile.children.cycles-pp.get_any_partial
> > 0.10 ± 9% +0.1 0.18 ± 5% perf-
> > profile.children.cycles-pp.available_idle_cpu
> > 0.00 +0.1 0.08 ± 9% perf-
> > profile.children.cycles-pp.hrtimer_start_range_ns
> > 0.00 +0.1 0.08 ± 11% perf-
> > profile.children.cycles-pp.dl_server_start
> > 0.00 +0.1 0.08 ± 11% perf-
> > profile.children.cycles-pp.dl_server_stop
> > 0.46 ± 2% +0.1 0.54 ± 2% perf-
> > profile.children.cycles-pp.select_task_rq_fair
> > 0.00 +0.1 0.10 ± 10% perf-
> > profile.children.cycles-pp.select_idle_core
> > 0.09 ± 7% +0.1 0.20 ± 8% perf-
> > profile.children.cycles-pp.select_idle_cpu
> > 0.18 ± 4% +0.1 0.31 ± 6% perf-
> > profile.children.cycles-pp.select_idle_sibling
> > 0.00 +0.2 0.18 ± 4% perf-
> > profile.children.cycles-pp.process_one_work
> > 0.06 ± 13% +0.2 0.25 ± 9% perf-
> > profile.children.cycles-pp.schedule_idle
> > 0.44 ± 2% +0.2 0.64 ± 8% perf-
> > profile.children.cycles-pp.prepare_to_wait
> > 0.00 +0.2 0.21 ± 5% perf-
> > profile.children.cycles-pp.kthread
> > 0.00 +0.2 0.21 ± 5% perf-
> > profile.children.cycles-pp.worker_thread
> > 0.00 +0.2 0.21 ± 4% perf-
> > profile.children.cycles-pp.ret_from_fork
> > 0.00 +0.2 0.21 ± 4% perf-
> > profile.children.cycles-pp.ret_from_fork_asm
> > 0.11 ± 12% +0.3 0.36 ± 9% perf-
> > profile.children.cycles-pp.sched_ttwu_pending
> > 0.31 ± 35% +0.3 0.59 ± 11% perf-
> > profile.children.cycles-pp.__cmd_record
> > 0.26 ± 45% +0.3 0.54 ± 13% perf-
> > profile.children.cycles-pp.perf_session__process_events
> > 0.26 ± 45% +0.3 0.54 ± 13% perf-
> > profile.children.cycles-pp.reader__read_event
> > 0.26 ± 45% +0.3 0.54 ± 13% perf-
> > profile.children.cycles-pp.record__finish_output
> > 0.16 ± 11% +0.3 0.45 ± 9% perf-
> > profile.children.cycles-pp.__flush_smp_call_function_queue
> > 0.14 ± 11% +0.3 0.45 ± 9% perf-
> > profile.children.cycles-pp.__sysvec_call_function_single
> > 0.14 ± 60% +0.3 0.48 ± 17% perf-
> > profile.children.cycles-pp.ordered_events__queue
> > 0.14 ± 61% +0.3 0.48 ± 17% perf-
> > profile.children.cycles-pp.queue_event
> > 0.15 ± 59% +0.3 0.49 ± 16% perf-
> > profile.children.cycles-pp.process_simple
> > 0.16 ± 12% +0.4 0.54 ± 10% perf-
> > profile.children.cycles-pp.sysvec_call_function_single
> > 4.61 ± 3% +0.5 5.13 ± 8% perf-
> > profile.children.cycles-pp.get_partial_node
> > 5.57 ± 3% +0.6 6.12 ± 7% perf-
> > profile.children.cycles-pp.___slab_alloc
> > 18.44 +0.6 19.00 perf-
> > profile.children.cycles-pp.sock_alloc_send_pskb
> > 6.51 ± 3% +0.7 7.26 ± 9% perf-
> > profile.children.cycles-pp.__put_partials
> > 0.33 ± 14% +1.0 1.30 ± 11% perf-
> > profile.children.cycles-pp.asm_sysvec_call_function_single
> > 0.34 ± 17% +1.1 1.47 ± 11% perf-
> > profile.children.cycles-pp.pv_native_safe_halt
> > 0.34 ± 17% +1.1 1.48 ± 11% perf-
> > profile.children.cycles-pp.acpi_safe_halt
> > 0.34 ± 17% +1.1 1.48 ± 11% perf-
> > profile.children.cycles-pp.acpi_idle_do_entry
> > 0.34 ± 17% +1.1 1.48 ± 11% perf-
> > profile.children.cycles-pp.acpi_idle_enter
> > 0.35 ± 17% +1.2 1.53 ± 11% perf-
> > profile.children.cycles-pp.cpuidle_enter_state
> > 0.35 ± 17% +1.2 1.54 ± 11% perf-
> > profile.children.cycles-pp.cpuidle_enter
> > 0.38 ± 17% +1.3 1.63 ± 11% perf-
> > profile.children.cycles-pp.cpuidle_idle_call
> > 0.45 ± 16% +1.5 1.94 ± 11% perf-
> > profile.children.cycles-pp.start_secondary
> > 0.46 ± 17% +1.5 1.96 ± 11% perf-
> > profile.children.cycles-pp.do_idle
> > 0.46 ± 17% +1.5 1.97 ± 11% perf-
> > profile.children.cycles-pp.common_startup_64
> > 0.46 ± 17% +1.5 1.97 ± 11% perf-
> > profile.children.cycles-pp.cpu_startup_entry
> > 13.76 ± 2% +1.7 15.44 ± 5% perf-
> > profile.children.cycles-pp._raw_spin_lock_irqsave
> > 12.09 ± 2% +1.9 14.00 ± 6% perf-
> > profile.children.cycles-pp.native_queued_spin_lock_slowpath
> > 3.93 -0.2 3.69 perf-
> > profile.self.cycles-pp.clear_bhb_loop
> > 3.43 ± 3% -0.1 3.29 perf-
> > profile.self.cycles-pp._copy_to_iter
> > 0.50 ± 2% -0.1 0.39 ± 5% perf-
> > profile.self.cycles-pp.switch_mm_irqs_off
> > 1.37 ± 4% -0.1 1.27 ± 2% perf-
> > profile.self.cycles-pp._copy_from_iter
> > 0.28 ± 2% -0.1 0.18 ± 7% perf-
> > profile.self.cycles-pp.__enqueue_entity
> > 1.41 ± 3% -0.1 1.31 ± 2% perf-
> > profile.self.cycles-pp.fdget_pos
> > 2.51 -0.1 2.42 perf-
> > profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> > 1.35 -0.1 1.28 perf-
> > profile.self.cycles-pp.read
> > 2.24 -0.1 2.17 perf-
> > profile.self.cycles-pp.do_syscall_64
> > 0.27 ± 3% -0.1 0.20 ± 3% perf-
> > profile.self.cycles-pp.update_cfs_group
> > 1.28 -0.1 1.22 perf-
> > profile.self.cycles-pp.sock_write_iter
> > 0.84 -0.1 0.77 perf-
> > profile.self.cycles-pp.vfs_read
> > 1.42 -0.1 1.36 perf-
> > profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 1.20 -0.1 1.14 perf-
> > profile.self.cycles-pp.__alloc_skb
> > 0.18 ± 2% -0.1 0.13 ± 5% perf-
> > profile.self.cycles-pp.pick_eevdf
> > 1.04 -0.1 0.99 perf-
> > profile.self.cycles-pp.its_return_thunk
> > 0.29 ± 2% -0.1 0.24 ± 4% perf-
> > profile.self.cycles-pp.restore_fpregs_from_fpstate
> > 0.28 ± 5% -0.1 0.23 ± 6% perf-
> > profile.self.cycles-pp.update_curr
> > 0.13 ± 5% -0.0 0.08 ± 5% perf-
> > profile.self.cycles-pp.switch_fpu_return
> > 0.20 ± 3% -0.0 0.15 ± 6% perf-
> > profile.self.cycles-pp.__dequeue_entity
> > 1.00 -0.0 0.95 perf-
> > profile.self.cycles-pp.kmem_cache_alloc_node_noprof
> > 0.33 -0.0 0.28 ± 2% perf-
> > profile.self.cycles-pp.update_load_avg
> > 0.88 -0.0 0.83 ± 2% perf-
> > profile.self.cycles-pp.vfs_write
> > 0.91 -0.0 0.86 perf-
> > profile.self.cycles-pp.sock_read_iter
> > 0.13 ± 3% -0.0 0.08 ± 4% perf-
> > profile.self.cycles-pp.update_curr_se
> > 0.25 ± 2% -0.0 0.21 ± 4% perf-
> > profile.self.cycles-pp.__update_load_avg_se
> > 1.22 -0.0 1.18 perf-
> > profile.self.cycles-pp.__kmalloc_node_track_caller_noprof
> > 0.68 -0.0 0.63 perf-
> > profile.self.cycles-pp.__check_object_size
> > 0.78 ± 2% -0.0 0.74 perf-
> > profile.self.cycles-pp.obj_cgroup_charge_account
> > 0.20 ± 3% -0.0 0.16 ± 4% perf-
> > profile.self.cycles-pp.__switch_to
> > 0.15 ± 3% -0.0 0.11 ± 4% perf-
> > profile.self.cycles-pp.try_to_wake_up
> > 0.90 -0.0 0.86 perf-
> > profile.self.cycles-pp.entry_SYSCALL_64
> > 0.76 ± 2% -0.0 0.73 perf-
> > profile.self.cycles-pp.__check_heap_object
> > 0.92 -0.0 0.89 ± 2% perf-
> > profile.self.cycles-pp.__account_obj_stock
> > 0.19 ± 2% -0.0 0.16 ± 2% perf-
> > profile.self.cycles-pp.check_stack_object
> > 0.40 ± 3% -0.0 0.37 perf-
> > profile.self.cycles-pp.__schedule
> > 0.60 ± 2% -0.0 0.56 ± 3% perf-
> > profile.self.cycles-pp.__virt_addr_valid
> > 0.71 -0.0 0.68 perf-
> > profile.self.cycles-pp.__skb_datagram_iter
> > 0.18 ± 4% -0.0 0.14 ± 5% perf-
> > profile.self.cycles-pp.task_mm_cid_work
> > 0.68 -0.0 0.65 perf-
> > profile.self.cycles-pp.refill_obj_stock
> > 0.34 -0.0 0.31 ± 2% perf-
> > profile.self.cycles-pp.unix_stream_recvmsg
> > 0.06 ± 7% -0.0 0.03 ± 70% perf-
> > profile.self.cycles-pp.enqueue_task
> > 0.11 -0.0 0.08 perf-
> > profile.self.cycles-pp.pick_task_fair
> > 0.15 ± 2% -0.0 0.12 ± 3% perf-
> > profile.self.cycles-pp.enqueue_task_fair
> > 0.20 ± 3% -0.0 0.17 ± 7% perf-
> > profile.self.cycles-pp.__update_load_avg_cfs_rq
> > 0.41 -0.0 0.38 perf-
> > profile.self.cycles-pp.sock_recvmsg
> > 0.10 -0.0 0.07 ± 6% perf-
> > profile.self.cycles-pp.update_min_vruntime
> > 0.13 ± 3% -0.0 0.10 perf-
> > profile.self.cycles-pp.task_h_load
> > 0.23 ± 3% -0.0 0.20 ± 6% perf-
> > profile.self.cycles-pp.__get_user_8
> > 0.12 ± 4% -0.0 0.10 ± 3% perf-
> > profile.self.cycles-pp.exit_to_user_mode_loop
> > 0.39 ± 2% -0.0 0.37 ± 2% perf-
> > profile.self.cycles-pp.rw_verify_area
> > 0.11 ± 3% -0.0 0.09 ± 7% perf-
> > profile.self.cycles-pp.os_xsave
> > 0.12 ± 3% -0.0 0.10 ± 3% perf-
> > profile.self.cycles-pp.pick_next_task_fair
> > 0.35 -0.0 0.33 ± 2% perf-
> > profile.self.cycles-pp.skb_copy_datagram_from_iter
> > 0.46 -0.0 0.44 perf-
> > profile.self.cycles-pp.mutex_lock
> > 0.11 ± 4% -0.0 0.09 ± 4% perf-
> > profile.self.cycles-pp.__switch_to_asm
> > 0.10 ± 3% -0.0 0.08 ± 5% perf-
> > profile.self.cycles-pp.enqueue_entity
> > 0.08 ± 7% -0.0 0.06 ± 6% perf-
> > profile.self.cycles-pp.place_entity
> > 0.30 -0.0 0.28 ± 2% perf-
> > profile.self.cycles-pp.alloc_skb_with_frags
> > 0.50 -0.0 0.48 perf-
> > profile.self.cycles-pp.kfree
> > 0.30 -0.0 0.28 perf-
> > profile.self.cycles-pp.ksys_write
> > 0.12 ± 3% -0.0 0.10 ± 3% perf-
> > profile.self.cycles-pp.dequeue_entity
> > 0.11 ± 4% -0.0 0.09 perf-
> > profile.self.cycles-pp.prepare_to_wait
> > 0.19 ± 2% -0.0 0.17 perf-
> > profile.self.cycles-pp.update_rq_clock_task
> > 0.27 -0.0 0.25 ± 2% perf-
> > profile.self.cycles-pp.__build_skb_around
> > 0.08 ± 6% -0.0 0.06 ± 9% perf-
> > profile.self.cycles-pp.vruntime_eligible
> > 0.12 ± 4% -0.0 0.10 perf-
> > profile.self.cycles-pp.__wake_up_common
> > 0.27 -0.0 0.26 perf-
> > profile.self.cycles-pp.kmalloc_reserve
> > 0.48 -0.0 0.46 perf-
> > profile.self.cycles-pp.unix_write_space
> > 0.19 -0.0 0.18 ± 2% perf-
> > profile.self.cycles-pp.skb_copy_datagram_iter
> > 0.07 -0.0 0.06 ± 6% perf-
> > profile.self.cycles-pp.__calc_delta
> > 0.06 ± 6% -0.0 0.05 perf-
> > profile.self.cycles-pp.__put_user_8
> > 0.28 -0.0 0.27 perf-
> > profile.self.cycles-pp._raw_spin_unlock_irqrestore
> > 0.11 -0.0 0.10 perf-
> > profile.self.cycles-pp.wait_for_unix_gc
> > 0.05 +0.0 0.06 perf-
> > profile.self.cycles-pp.__x64_sys_write
> > 0.07 ± 5% +0.0 0.08 ± 5% perf-
> > profile.self.cycles-pp.native_irq_return_iret
> > 0.19 ± 7% +0.0 0.22 ± 4% perf-
> > profile.self.cycles-pp.prepare_task_switch
> > 0.10 ± 6% +0.1 0.17 ± 5% perf-
> > profile.self.cycles-pp.available_idle_cpu
> > 0.14 ± 61% +0.3 0.48 ± 17% perf-
> > profile.self.cycles-pp.queue_event
> > 0.19 ± 18% +0.7 0.85 ± 12% perf-
> > profile.self.cycles-pp.pv_native_safe_halt
> > 12.07 ± 2% +1.9 13.97 ± 6% perf-
> > profile.self.cycles-pp.native_queued_spin_lock_slowpath
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-ivb-2ep2: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2
> > @ 2.70GHz (Ivy Bridge-EP) with 64G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/t
> > esttime:
> > gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-
> > 20240206.cgz/lkp-ivb-2ep2/shell_rtns_3/aim9/300s
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 9156 +20.2% 11004 vmstat.system.cs
> > 8715946 ± 6% -14.0% 7494314 ± 13% meminfo.DirectMap2M
> > 10992 +85.4% 20381 meminfo.PageTables
> > 318.58 -1.7% 313.01
> > aim9.shell_rtns_3.ops_per_sec
> > 27145198 -2.1% 26576524
> > aim9.time.minor_page_faults
> > 1049306 -1.8% 1030938
> > aim9.time.voluntary_context_switches
> > 6173 ± 20% +74.0% 10742 ± 4% numa-
> > meminfo.node0.PageTables
> > 5702 ± 31% +55.1% 8844 ± 19% numa-
> > meminfo.node0.Shmem
> > 4803 ± 25% +100.6% 9636 ± 6% numa-
> > meminfo.node1.PageTables
> > 1538 ± 20% +73.7% 2673 ± 5% numa-
> > vmstat.node0.nr_page_table_pages
> > 1425 ± 31% +55.1% 2210 ± 19% numa-
> > vmstat.node0.nr_shmem
> > 1194 ± 25% +101.2% 2402 ± 6% numa-
> > vmstat.node1.nr_page_table_pages
> > 30413 +19.3% 36291
> > sched_debug.cpu.nr_switches.avg
> > 84768 ± 6% +20.3% 101955 ± 4%
> > sched_debug.cpu.nr_switches.max
> > 25510 ± 13% +23.0% 31383 ± 3%
> > sched_debug.cpu.nr_switches.stddev
> > 2727 +85.8% 5066 proc-
> > vmstat.nr_page_table_pages
> > 19325131 -1.6% 19014535 proc-vmstat.numa_hit
> > 19274656 -1.6% 18964467 proc-
> > vmstat.numa_local
> > 19877211 -1.6% 19563123 proc-
> > vmstat.pgalloc_normal
> > 28020416 -2.0% 27451741 proc-vmstat.pgfault
> > 19829318 -1.6% 19508263 proc-vmstat.pgfree
> > 2679 -1.6% 2636 proc-
> > vmstat.unevictable_pgs_culled
> > 0.03 ± 10% +30.9% 0.04 ± 2% perf-
> > sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.02 ± 5% +26.2% 0.02 ± 3% perf-
> > sched.total_sch_delay.average.ms
> > 27.03 ± 2% -12.4% 23.66 perf-
> > sched.total_wait_and_delay.average.ms
> > 23171 +18.2% 27385 perf-
> > sched.total_wait_and_delay.count.ms
> > 27.01 ± 2% -12.5% 23.64 perf-
> > sched.total_wait_time.average.ms
> > 110.73 ± 4% -71.1% 31.98 perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 1662 ± 2% +278.6% 6294 perf-
> > sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 110.70 ± 4% -71.1% 31.94 perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 5.94 +0.1 6.00 perf-stat.i.branch-
> > miss-rate%
> > 9184 +20.2% 11041 perf-stat.i.context-
> > switches
> > 1.96 +1.6% 1.99 perf-stat.i.cpi
> > 71.73 ± 4% +66.1% 119.11 ± 5% perf-stat.i.cpu-
> > migrations
> > 0.53 -1.4% 0.52 perf-stat.i.ipc
> > 3.79 -2.0% 3.71 perf-
> > stat.i.metric.K/sec
> > 90919 -2.0% 89065 perf-stat.i.minor-
> > faults
> > 90919 -2.0% 89065 perf-stat.i.page-
> > faults
> > 6.00 +0.1 6.06 perf-
> > stat.overall.branch-miss-rate%
> > 1.79 +1.2% 1.81 perf-
> > stat.overall.cpi
> > 0.56 -1.2% 0.55 perf-
> > stat.overall.ipc
> > 9154 +20.2% 11004 perf-
> > stat.ps.context-switches
> > 71.49 ± 4% +66.1% 118.72 ± 5% perf-stat.ps.cpu-
> > migrations
> > 90616 -2.0% 88768 perf-stat.ps.minor-
> > faults
> > 90616 -2.0% 88768 perf-stat.ps.page-
> > faults
> > 8.89 -0.2 8.68 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 8.88 -0.2 8.66 perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 3.47 ± 2% -0.2 3.29 perf-
> > profile.calltrace.cycles-
> > pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64
> > _after_hwframe
> > 3.47 ± 2% -0.2 3.29 perf-
> > profile.calltrace.cycles-
> > pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe
> > 3.51 ± 3% -0.2 3.33 perf-
> > profile.calltrace.cycles-
> > pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 3.47 ± 2% -0.2 3.29 perf-
> > profile.calltrace.cycles-
> > pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_sysca
> > ll_64
> > 1.66 ± 2% -0.1 1.57 ± 4% perf-
> > profile.calltrace.cycles-pp.setlocale
> > 0.27 ±100% +0.3 0.61 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.common_s
> > tartup_64
> > 0.18 ±141% +0.4 0.60 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_seconda
> > ry
> > 62.46 +0.6 63.01 perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_
> > startup_entry
> > 0.09 ±223% +0.6 0.65 ± 7% perf-
> > profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > 0.09 ±223% +0.6 0.65 ± 7% perf-
> > profile.calltrace.cycles-pp.ret_from_fork_asm
> > 49.01 +0.6 49.60 perf-
> > profile.calltrace.cycles-
> > pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.d
> > o_idle
> > 67.47 +0.7 68.17 perf-
> > profile.calltrace.cycles-pp.common_startup_64
> > 20.25 -0.7 19.58 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 20.21 -0.7 19.54 perf-
> > profile.children.cycles-pp.do_syscall_64
> > 6.54 -0.2 6.33 perf-
> > profile.children.cycles-pp.asm_exc_page_fault
> > 6.10 -0.2 5.90 perf-
> > profile.children.cycles-pp.do_user_addr_fault
> > 3.77 ± 3% -0.2 3.60 perf-
> > profile.children.cycles-pp.x64_sys_call
> > 3.62 ± 3% -0.2 3.46 perf-
> > profile.children.cycles-pp.do_exit
> > 2.63 ± 3% -0.2 2.48 ± 2% perf-
> > profile.children.cycles-pp.__mmput
> > 2.16 ± 2% -0.1 2.06 ± 3% perf-
> > profile.children.cycles-pp.ksys_mmap_pgoff
> > 1.66 ± 2% -0.1 1.57 ± 4% perf-
> > profile.children.cycles-pp.setlocale
> > 2.69 ± 2% -0.1 2.61 perf-
> > profile.children.cycles-pp.do_pte_missing
> > 0.77 ± 5% -0.1 0.70 ± 6% perf-
> > profile.children.cycles-pp.tlb_finish_mmu
> > 0.92 ± 2% -0.0 0.87 ± 4% perf-
> > profile.children.cycles-pp.__irqentry_text_end
> > 0.08 ± 10% -0.0 0.04 ± 71% perf-
> > profile.children.cycles-pp.tick_nohz_tick_stopped
> > 0.10 ± 11% -0.0 0.07 ± 21% perf-
> > profile.children.cycles-pp.__percpu_counter_init_many
> > 0.14 ± 9% -0.0 0.11 ± 4% perf-
> > profile.children.cycles-pp.strnlen
> > 0.12 ± 11% -0.0 0.10 ± 8% perf-
> > profile.children.cycles-pp.mas_prev_slot
> > 0.11 ± 12% +0.0 0.14 ± 9% perf-
> > profile.children.cycles-pp.update_curr
> > 0.19 ± 8% +0.0 0.22 ± 6% perf-
> > profile.children.cycles-pp.enqueue_entity
> > 0.10 ± 11% +0.0 0.13 ± 11% perf-
> > profile.children.cycles-pp.__perf_event_task_sched_out
> > 0.05 ± 46% +0.0 0.08 ± 13% perf-
> > profile.children.cycles-pp.select_task_rq
> > 0.13 ± 14% +0.0 0.17 ± 8% perf-
> > profile.children.cycles-pp.perf_pmu_sched_task
> > 0.20 ± 10% +0.0 0.24 ± 2% perf-
> > profile.children.cycles-pp.try_to_wake_up
> > 0.28 ± 9% +0.1 0.34 ± 9% perf-
> > profile.children.cycles-pp.exit_to_user_mode_loop
> > 0.04 ± 44% +0.1 0.11 ± 13% perf-
> > profile.children.cycles-pp.__queue_work
> > 0.30 ± 11% +0.1 0.38 ± 8% perf-
> > profile.children.cycles-pp.ttwu_do_activate
> > 0.30 ± 4% +0.1 0.38 ± 8% perf-
> > profile.children.cycles-pp.__pick_next_task
> > 0.22 ± 7% +0.1 0.29 ± 9% perf-
> > profile.children.cycles-pp.try_to_block_task
> > 0.02 ±141% +0.1 0.09 ± 10% perf-
> > profile.children.cycles-pp.kick_pool
> > 0.02 ± 99% +0.1 0.10 ± 19% perf-
> > profile.children.cycles-pp.queue_work_on
> > 0.25 ± 4% +0.1 0.35 ± 7% perf-
> > profile.children.cycles-pp.sched_ttwu_pending
> > 0.33 ± 6% +0.1 0.43 ± 5% perf-
> > profile.children.cycles-pp.flush_smp_call_function_queue
> > 0.29 ± 4% +0.1 0.39 ± 6% perf-
> > profile.children.cycles-pp.__flush_smp_call_function_queue
> > 0.51 ± 6% +0.1 0.63 ± 6% perf-
> > profile.children.cycles-pp.schedule_idle
> > 0.46 ± 7% +0.1 0.58 ± 5% perf-
> > profile.children.cycles-pp.schedule
> > 0.88 ± 6% +0.2 1.04 ± 5% perf-
> > profile.children.cycles-pp.ret_from_fork_asm
> > 0.18 ± 6% +0.2 0.34 ± 8% perf-
> > profile.children.cycles-pp.worker_thread
> > 0.88 ± 6% +0.2 1.04 ± 5% perf-
> > profile.children.cycles-pp.ret_from_fork
> > 0.38 ± 8% +0.2 0.56 ± 10% perf-
> > profile.children.cycles-pp.kthread
> > 1.08 ± 3% +0.2 1.32 ± 2% perf-
> > profile.children.cycles-pp.__schedule
> > 66.15 +0.5 66.64 perf-
> > profile.children.cycles-pp.cpuidle_idle_call
> > 62.89 +0.6 63.47 perf-
> > profile.children.cycles-pp.cpuidle_enter_state
> > 63.00 +0.6 63.59 perf-
> > profile.children.cycles-pp.cpuidle_enter
> > 49.10 +0.6 49.69 perf-
> > profile.children.cycles-pp.intel_idle
> > 67.47 +0.7 68.17 perf-
> > profile.children.cycles-pp.do_idle
> > 67.47 +0.7 68.17 perf-
> > profile.children.cycles-pp.common_startup_64
> > 67.47 +0.7 68.17 perf-
> > profile.children.cycles-pp.cpu_startup_entry
> > 0.91 ± 2% -0.0 0.86 ± 4% perf-
> > profile.self.cycles-pp.__irqentry_text_end
> > 0.14 ± 11% +0.1 0.22 ± 11% perf-
> > profile.self.cycles-pp.timerqueue_del
> > 49.08 +0.6 49.68 perf-
> > profile.self.cycles-pp.intel_idle
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU
> > @ 2.00GHz (Ice Lake) with 256G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/ro
> > otfs/tbox_group/testcase:
> > gcc-12/performance/pipe/4/x86_64-rhel-9.4/process/800%/debian-
> > 12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 3745213 ± 39% +108.1% 7794858 ± 12% cpuidle..usage
> > 186670 +17.3% 218939 ± 2% meminfo.Percpu
> > 5.00 +306.7% 20.33 ± 66%
> > mpstat.max_utilization.seconds
> > 9.35 ± 76% -4.5 4.80 ±141% perf-
> > profile.calltrace.cycles-
> > pp.__ordered_events__flush.perf_session__process_events.record__fin
> > ish_output.__cmd_record
> > 8.90 ± 75% -4.3 4.57 ±141% perf-
> > profile.calltrace.cycles-
> > pp.perf_session__deliver_event.__ordered_events__flush.perf_session
> > __process_events.record__finish_output.__cmd_record
> > 3283 ± 7% -16.2% 2751 ± 5%
> > sched_debug.cfs_rq:/.avg_vruntime.avg
> > 3283 ± 7% -16.2% 2751 ± 5%
> > sched_debug.cfs_rq:/.min_vruntime.avg
> > 1522512 ± 6% +80.0% 2739797 ± 4% vmstat.system.cs
> > 308726 ± 8% +60.5% 495472 ± 5% vmstat.system.in
> > 467562 +3.7% 485068 ± 2% proc-
> > vmstat.nr_kernel_stack
> > 266084 +3.8% 276310 proc-
> > vmstat.nr_slab_unreclaimable
> > 1.375e+08 -2.0% 1.347e+08 proc-vmstat.numa_hit
> > 1.373e+08 -2.0% 1.346e+08 proc-
> > vmstat.numa_local
> > 217472 ± 3% -28.1% 156410 proc-
> > vmstat.numa_other
> > 1.382e+08 -2.0% 1.354e+08 proc-
> > vmstat.pgalloc_normal
> > 1.375e+08 -2.0% 1.347e+08 proc-vmstat.pgfree
> > 1514102 -6.2% 1420287 hackbench.throughput
> > 1480357 -6.7% 1380775
> > hackbench.throughput_avg
> > 1514102 -6.2% 1420287
> > hackbench.throughput_best
> > 1436918 -7.9% 1323413
> > hackbench.throughput_worst
> > 14551264 ± 13% +138.1% 34644707 ± 3%
> > hackbench.time.involuntary_context_switches
> > 9919 -1.6% 9762
> > hackbench.time.percent_of_cpu_this_job_got
> > 4239 +4.5% 4428
> > hackbench.time.system_time
> > 56365933 ± 6% +65.3% 93172066 ± 4%
> > hackbench.time.voluntary_context_switches
> > 65085618 +26.7% 82440571 ± 2% perf-stat.i.branch-
> > misses
> > 31.25 -1.6 29.66 perf-stat.i.cache-
> > miss-rate%
> > 2.469e+08 +8.9% 2.689e+08 perf-stat.i.cache-
> > misses
> > 7.519e+08 +15.9% 8.712e+08 perf-stat.i.cache-
> > references
> > 1353061 ± 7% +87.5% 2537450 ± 5% perf-stat.i.context-
> > switches
> > 2.269e+11 +3.5% 2.348e+11 perf-stat.i.cpu-
> > cycles
> > 134588 ± 13% +81.9% 244825 ± 8% perf-stat.i.cpu-
> > migrations
> > 13.60 ± 5% +70.5% 23.20 ± 5% perf-
> > stat.i.metric.K/sec
> > 1.26 +7.6% 1.35 perf-
> > stat.overall.MPKI
> > 0.11 ± 2% +0.0 0.14 ± 2% perf-
> > stat.overall.branch-miss-rate%
> > 34.12 -2.1 31.97 perf-
> > stat.overall.cache-miss-rate%
> > 1.17 +1.8% 1.19 perf-
> > stat.overall.cpi
> > 931.96 -5.3% 882.44 perf-
> > stat.overall.cycles-between-cache-misses
> > 0.85 -1.8% 0.84 perf-
> > stat.overall.ipc
> > 5.372e+10 -1.2% 5.31e+10 perf-stat.ps.branch-
> > instructions
> > 57783128 ± 2% +32.9% 76802898 ± 2% perf-stat.ps.branch-
> > misses
> > 2.696e+08 +7.2% 2.89e+08 perf-stat.ps.cache-
> > misses
> > 7.902e+08 +14.4% 9.039e+08 perf-stat.ps.cache-
> > references
> > 1288664 ± 7% +94.6% 2508227 ± 5% perf-
> > stat.ps.context-switches
> > 2.512e+11 +1.5% 2.55e+11 perf-stat.ps.cpu-
> > cycles
> > 122960 ± 14% +82.3% 224127 ± 9% perf-stat.ps.cpu-
> > migrations
> > 1.108e+13 +5.7% 1.171e+13 perf-
> > stat.total.instructions
> > 0.94 ±223% +5929.9% 56.62 ±121% perf-
> > sched.sch_delay.avg.ms.__cond_resched.process_one_work.worker_threa
> > d.kthread.ret_from_fork
> > 26.44 ± 81% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 100.25 ±141% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 9.01 ± 43% +1823.1% 173.24 ±106% perf-
> > sched.sch_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_
> > read
> > 49.43 ± 14% +73.8% 85.93 ± 19% perf-
> > sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall
> > _64
> > 130.63 ± 17% +135.8% 308.04 ± 28% perf-
> > sched.sch_delay.avg.ms.anon_pipe_write.vfs_write.ksys_write.do_sysc
> > all_64
> > 18.09 ± 30% +130.4% 41.70 ± 26% perf-
> > sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 196.51 ± 21% +102.9% 398.77 ± 15% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 34.17 ± 39% +191.1% 99.46 ± 20% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown].[unknown]
> > 154.91 ±163% +1649.9% 2710 ± 91% perf-
> > sched.sch_delay.max.ms.__cond_resched.anon_pipe_write.vfs_write.ksy
> > s_write.do_syscall_64
> > 0.94 ±223% +1.9e+05% 1743 ±120% perf-
> > sched.sch_delay.max.ms.__cond_resched.process_one_work.worker_threa
> > d.kthread.ret_from_fork
> > 3.19 ±124% -91.9% 0.26 ±150% perf-
> > sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret
> > _from_fork.ret_from_fork_asm
> > 646.26 ± 94% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 282.66 ±139% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 63.17 ± 52% +2854.4% 1866 ±121% perf-
> > sched.sch_delay.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_
> > read
> > 1507 ± 35% +249.4% 5266 ± 47% perf-
> > sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 3915 ± 67% +98.7% 7779 ± 16% perf-
> > sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 53.31 ± 18% +79.9% 95.90 ± 23% perf-
> > sched.total_sch_delay.average.ms
> > 149.37 ± 18% +80.0% 268.92 ± 22% perf-
> > sched.total_wait_and_delay.average.ms
> > 96.07 ± 18% +80.1% 173.01 ± 21% perf-
> > sched.total_wait_time.average.ms
> > 244.53 ± 47% -100.0% 0.00 perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.anon_pipe_rea
> > d.vfs_read.ksys_read
> > 529.64 ± 20% +38.5% 733.60 ± 20% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.anon_pipe_wri
> > te.vfs_write.ksys_write
> > 136.52 ± 15% +73.7% 237.07 ± 18% perf-
> > sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_sy
> > scall_64
> > 373.41 ± 16% +136.3% 882.34 ± 27% perf-
> > sched.wait_and_delay.avg.ms.anon_pipe_write.vfs_write.ksys_write.do
> > _syscall_64
> > 51.96 ± 29% +127.5% 118.22 ± 25% perf-
> > sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe.[unknown]
> > 554.86 ± 23% +103.0% 1126 ± 14% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_a
> > pic_timer_interrupt.[unknown].[unknown]
> > 298.52 ±136% +436.9% 1602 ± 27% perf-
> > sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_sch
> > edule_timeout.constprop.0.do_poll
> > 556.66 ± 37% -97.1% 16.09 ± 47% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 707.67 ± 31% -100.0% 0.00 perf-
> > sched.wait_and_delay.count.__cond_resched.mutex_lock.anon_pipe_read
> > .vfs_read.ksys_read
> > 1358 ± 28% +4707.9% 65291 ± 27% perf-
> > sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 12184 ± 5% -100.0% 0.00 perf-
> > sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.anon_pipe_rea
> > d.vfs_read.ksys_read
> > 1393 ±134% +379.9% 6685 ± 15% perf-
> > sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_sch
> > edule_timeout.constprop.0.do_poll
> > 6927 ± 6% +119.8% 15224 ± 19% perf-
> > sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 341.61 ± 21% +39.1% 475.15 ± 20% perf-
> > sched.wait_time.avg.ms.__cond_resched.mutex_lock.anon_pipe_write.vf
> > s_write.ksys_write
> > 51.39 ± 99% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 121.14 ±122% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 87.09 ± 15% +73.6% 151.14 ± 18% perf-
> > sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall
> > _64
> > 242.78 ± 16% +136.6% 574.31 ± 27% perf-
> > sched.wait_time.avg.ms.anon_pipe_write.vfs_write.ksys_write.do_sysc
> > all_64
> > 33.86 ± 29% +126.0% 76.52 ± 24% perf-
> > sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 250.32 ±109% -89.4% 26.44 ±111% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_common_interr
> > upt.[unknown].[unknown]
> > 358.36 ± 25% +103.1% 727.72 ± 14% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 77.40 ± 47% +102.5% 156.70 ± 28% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown].[unknown]
> > 17.91 ± 42% -75.3% 4.42 ± 76% perf-
> > sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_fro
> > m_fork_asm
> > 266.70 ±137% +431.6% 1417 ± 36% perf-
> > sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule
> > _timeout.constprop.0.do_poll
> > 536.93 ± 40% -97.4% 13.81 ± 50% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 180.38 ±135% +2208.8% 4164 ± 71% perf-
> > sched.wait_time.max.ms.__cond_resched.anon_pipe_write.vfs_write.ksy
> > s_write.do_syscall_64
> > 1028 ±129% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 312.94 ±123% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 418.66 ±132% -93.7% 26.44 ±111% perf-
> > sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_common_interr
> > upt.[unknown].[unknown]
> > 1388 ±133% +379.7% 6660 ± 15% perf-
> > sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule
> > _timeout.constprop.0.do_poll
> > 2022 ± 25% +164.9% 5358 ± 46% perf-
> > sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-ivb-2ep2: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2
> > @ 2.70GHz (Ivy Bridge-EP) with 64G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/t
> > esttime:
> > gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-
> > 20240206.cgz/lkp-ivb-2ep2/shell_rtns_1/aim9/300s
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 11004 +86.2% 20490 meminfo.PageTables
> > 121.33 ± 12% +18.8% 144.17 ± 5% perf-c2c.DRAM.remote
> > 9155 +20.0% 10990 vmstat.system.cs
> > 5129 ± 20% +107.2% 10631 ± 3% numa-
> > meminfo.node0.PageTables
> > 5864 ± 17% +67.3% 9811 ± 3% numa-
> > meminfo.node1.PageTables
> > 1278 ± 20% +107.9% 2658 ± 3% numa-
> > vmstat.node0.nr_page_table_pages
> > 1469 ± 17% +66.4% 2446 ± 3% numa-
> > vmstat.node1.nr_page_table_pages
> > 319.43 -2.1% 312.66
> > aim9.shell_rtns_1.ops_per_sec
> > 27217846 -2.5% 26546962
> > aim9.time.minor_page_faults
> > 1051878 -2.1% 1029547
> > aim9.time.voluntary_context_switches
> > 30502 +18.6% 36187
> > sched_debug.cpu.nr_switches.avg
> > 90327 ± 12% +22.7% 110866 ± 4%
> > sched_debug.cpu.nr_switches.max
> > 26316 ± 16% +25.5% 33021 ± 5%
> > sched_debug.cpu.nr_switches.stddev
> > 0.03 ± 7% +70.7% 0.05 ± 53% perf-
> > sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.02 ± 3% +38.9% 0.02 ± 28% perf-
> > sched.total_sch_delay.average.ms
> > 27.43 ± 2% -14.5% 23.45 perf-
> > sched.total_wait_and_delay.average.ms
> > 23174 +18.0% 27340 perf-
> > sched.total_wait_and_delay.count.ms
> > 27.41 ± 2% -14.6% 23.42 perf-
> > sched.total_wait_time.average.ms
> > 115.38 ± 3% -71.9% 32.37 ± 2% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 1656 ± 3% +280.2% 6299 perf-
> > sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 115.35 ± 3% -72.0% 32.31 ± 2% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 2737 +86.1% 5095 proc-
> > vmstat.nr_page_table_pages
> > 30460 +3.2% 31439 proc-vmstat.nr_shmem
> > 27933 +1.8% 28432 proc-
> > vmstat.nr_slab_unreclaimable
> > 19466749 -2.5% 18980434 proc-vmstat.numa_hit
> > 19414531 -2.5% 18927584 proc-
> > vmstat.numa_local
> > 20028107 -2.5% 19528806 proc-
> > vmstat.pgalloc_normal
> > 28087705 -2.4% 27417155 proc-vmstat.pgfault
> > 19980173 -2.5% 19474402 proc-vmstat.pgfree
> > 420074 -5.7% 396239 ± 8% proc-vmstat.pgreuse
> > 2685 -1.9% 2633 proc-
> > vmstat.unevictable_pgs_culled
> > 5.48e+08 -1.2% 5.412e+08 perf-stat.i.branch-
> > instructions
> > 5.92 +0.1 6.00 perf-stat.i.branch-
> > miss-rate%
> > 9195 +19.9% 11021 perf-stat.i.context-
> > switches
> > 1.96 +1.7% 1.99 perf-stat.i.cpi
> > 70.13 +73.4% 121.59 ± 8% perf-stat.i.cpu-
> > migrations
> > 2.725e+09 -1.3% 2.69e+09 perf-
> > stat.i.instructions
> > 0.53 -1.6% 0.52 perf-stat.i.ipc
> > 3.80 -2.4% 3.71 perf-
> > stat.i.metric.K/sec
> > 91139 -2.4% 88949 perf-stat.i.minor-
> > faults
> > 91139 -2.4% 88949 perf-stat.i.page-
> > faults
> > 5.00 ± 44% +1.1 6.07 perf-
> > stat.overall.branch-miss-rate%
> > 1.49 ± 44% +21.9% 1.82 perf-
> > stat.overall.cpi
> > 7643 ± 44% +43.7% 10984 perf-
> > stat.ps.context-switches
> > 58.17 ± 44% +108.4% 121.21 ± 8% perf-stat.ps.cpu-
> > migrations
> > 2.06 ± 2% -0.2 1.87 ± 12% perf-
> > profile.calltrace.cycles-
> > pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCAL
> > L_64_after_hwframe
> > 0.98 ± 7% -0.2 0.83 ± 12% perf-
> > profile.calltrace.cycles-
> > pp.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic_timer_interr
> > upt.asm_sysvec_apic_timer_interrupt
> > 1.69 ± 2% -0.1 1.54 ± 2% perf-
> > profile.calltrace.cycles-pp.setlocale
> > 0.58 ± 5% -0.1 0.44 ± 44% perf-
> > profile.calltrace.cycles-
> > pp.entry_SYSCALL_64_after_hwframe.__open64_nocancel.setlocale
> > 0.72 ± 6% -0.1 0.60 ± 8% perf-
> > profile.calltrace.cycles-
> > pp.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu.sysvec_apic
> > _timer_interrupt
> > 3.21 ± 2% -0.1 3.11 perf-
> > profile.calltrace.cycles-
> > pp.exec_binprm.bprm_execve.do_execveat_common.__x64_sys_execve.do_s
> > yscall_64
> > 0.70 ± 4% -0.1 0.62 ± 6% perf-
> > profile.calltrace.cycles-
> > pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
> > 1.52 ± 2% -0.1 1.44 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_aft
> > er_hwframe
> > 1.34 ± 3% -0.1 1.28 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.__mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_6
> > 4
> > 0.89 ± 3% -0.1 0.84 perf-
> > profile.calltrace.cycles-
> > pp.perf_mux_hrtimer_handler.__hrtimer_run_queues.hrtimer_interrupt.
> > __sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
> > 0.17 ±141% +0.4 0.61 ± 7% perf-
> > profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > 0.17 ±141% +0.4 0.61 ± 7% perf-
> > profile.calltrace.cycles-pp.ret_from_fork_asm
> > 65.10 +0.5 65.56 perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.comm
> > on_startup_64
> > 66.40 +0.6 67.00 perf-
> > profile.calltrace.cycles-
> > pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
> > 66.46 +0.6 67.08 perf-
> > profile.calltrace.cycles-pp.start_secondary.common_startup_64
> > 66.46 +0.6 67.08 perf-
> > profile.calltrace.cycles-
> > pp.cpu_startup_entry.start_secondary.common_startup_64
> > 67.63 +0.7 68.30 perf-
> > profile.calltrace.cycles-pp.common_startup_64
> > 20.14 -0.6 19.51 perf-
> > profile.children.cycles-pp.do_syscall_64
> > 20.20 -0.6 19.57 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 1.13 ± 5% -0.2 0.98 ± 9% perf-
> > profile.children.cycles-pp.rcu_core
> > 1.69 ± 2% -0.1 1.54 ± 2% perf-
> > profile.children.cycles-pp.setlocale
> > 0.84 ± 4% -0.1 0.71 ± 5% perf-
> > profile.children.cycles-pp.rcu_do_batch
> > 2.16 ± 2% -0.1 2.04 ± 3% perf-
> > profile.children.cycles-pp.ksys_mmap_pgoff
> > 1.15 ± 4% -0.1 1.04 ± 5% perf-
> > profile.children.cycles-pp.__open64_nocancel
> > 3.22 ± 2% -0.1 3.12 perf-
> > profile.children.cycles-pp.exec_binprm
> > 2.09 ± 2% -0.1 2.00 ± 2% perf-
> > profile.children.cycles-pp.kernel_clone
> > 0.88 ± 4% -0.1 0.79 ± 4% perf-
> > profile.children.cycles-pp.mas_store_prealloc
> > 2.19 -0.1 2.10 ± 3% perf-
> > profile.children.cycles-pp.__x64_sys_openat
> > 0.70 ± 4% -0.1 0.62 ± 6% perf-
> > profile.children.cycles-pp.dup_mm
> > 1.36 ± 3% -0.1 1.30 perf-
> > profile.children.cycles-pp._Fork
> > 0.56 ± 4% -0.1 0.50 ± 8% perf-
> > profile.children.cycles-pp.dup_mmap
> > 0.09 ± 16% -0.1 0.03 ± 70% perf-
> > profile.children.cycles-pp.perf_adjust_freq_unthr_context
> > 0.31 ± 8% -0.1 0.25 ± 10% perf-
> > profile.children.cycles-pp.strncpy_from_user
> > 0.94 ± 3% -0.1 0.88 ± 2% perf-
> > profile.children.cycles-pp.perf_mux_hrtimer_handler
> > 0.41 ± 5% -0.0 0.36 ± 5% perf-
> > profile.children.cycles-pp.irqtime_account_irq
> > 0.18 ± 12% -0.0 0.14 ± 7% perf-
> > profile.children.cycles-pp.tlb_remove_table_rcu
> > 0.20 ± 7% -0.0 0.17 ± 9% perf-
> > profile.children.cycles-pp.perf_event_task_tick
> > 0.08 ± 14% -0.0 0.05 ± 49% perf-
> > profile.children.cycles-pp.mas_update_gap
> > 0.24 ± 5% -0.0 0.21 ± 5% perf-
> > profile.children.cycles-pp.filemap_read
> > 0.19 ± 7% -0.0 0.16 ± 8% perf-
> > profile.children.cycles-pp.__call_rcu_common
> > 0.22 ± 2% -0.0 0.19 ± 5% perf-
> > profile.children.cycles-pp.mas_next_slot
> > 0.09 ± 5% +0.0 0.12 ± 7% perf-
> > profile.children.cycles-pp.__perf_event_task_sched_out
> > 0.05 ± 47% +0.0 0.08 ± 10% perf-
> > profile.children.cycles-pp.lru_gen_del_folio
> > 0.10 ± 14% +0.0 0.12 ± 18% perf-
> > profile.children.cycles-pp.__folio_mod_stat
> > 0.12 ± 12% +0.0 0.16 ± 3% perf-
> > profile.children.cycles-pp.perf_pmu_sched_task
> > 0.20 ± 10% +0.0 0.24 ± 4% perf-
> > profile.children.cycles-pp.prepare_task_switch
> > 0.06 ± 47% +0.0 0.10 ± 11% perf-
> > profile.children.cycles-pp.__queue_work
> > 0.56 ± 5% +0.1 0.61 ± 4% perf-
> > profile.children.cycles-pp.sched_balance_domains
> > 0.04 ± 72% +0.1 0.09 ± 11% perf-
> > profile.children.cycles-pp.kick_pool
> > 0.04 ± 72% +0.1 0.09 ± 14% perf-
> > profile.children.cycles-pp.queue_work_on
> > 0.33 ± 6% +0.1 0.38 ± 7% perf-
> > profile.children.cycles-pp.dequeue_entities
> > 0.35 ± 6% +0.1 0.40 ± 7% perf-
> > profile.children.cycles-pp.dequeue_task_fair
> > 0.52 ± 6% +0.1 0.58 ± 5% perf-
> > profile.children.cycles-pp.enqueue_task_fair
> > 0.54 ± 7% +0.1 0.60 ± 5% perf-
> > profile.children.cycles-pp.enqueue_task
> > 0.28 ± 9% +0.1 0.35 ± 5% perf-
> > profile.children.cycles-pp.exit_to_user_mode_loop
> > 0.21 ± 4% +0.1 0.28 ± 12% perf-
> > profile.children.cycles-pp.try_to_block_task
> > 0.34 ± 4% +0.1 0.42 ± 3% perf-
> > profile.children.cycles-pp.ttwu_do_activate
> > 0.36 ± 3% +0.1 0.46 ± 6% perf-
> > profile.children.cycles-pp.flush_smp_call_function_queue
> > 0.28 ± 4% +0.1 0.38 ± 5% perf-
> > profile.children.cycles-pp.sched_ttwu_pending
> > 0.33 ± 2% +0.1 0.43 ± 5% perf-
> > profile.children.cycles-pp.__flush_smp_call_function_queue
> > 0.46 ± 7% +0.1 0.56 ± 6% perf-
> > profile.children.cycles-pp.schedule
> > 0.48 ± 8% +0.1 0.61 ± 8% perf-
> > profile.children.cycles-pp.timerqueue_del
> > 0.18 ± 13% +0.1 0.32 ± 11% perf-
> > profile.children.cycles-pp.worker_thread
> > 0.38 ± 9% +0.2 0.52 ± 10% perf-
> > profile.children.cycles-pp.kthread
> > 1.10 ± 5% +0.2 1.25 ± 2% perf-
> > profile.children.cycles-pp.__schedule
> > 0.85 ± 8% +0.2 1.01 ± 7% perf-
> > profile.children.cycles-pp.ret_from_fork
> > 0.85 ± 8% +0.2 1.02 ± 7% perf-
> > profile.children.cycles-pp.ret_from_fork_asm
> > 63.15 +0.5 63.64 perf-
> > profile.children.cycles-pp.cpuidle_enter
> > 66.26 +0.5 66.77 perf-
> > profile.children.cycles-pp.cpuidle_idle_call
> > 66.46 +0.6 67.08 perf-
> > profile.children.cycles-pp.start_secondary
> > 67.63 +0.7 68.30 perf-
> > profile.children.cycles-pp.common_startup_64
> > 67.63 +0.7 68.30 perf-
> > profile.children.cycles-pp.cpu_startup_entry
> > 67.63 +0.7 68.30 perf-
> > profile.children.cycles-pp.do_idle
> > 1.20 ± 3% -0.1 1.12 ± 4% perf-
> > profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.09 ± 16% -0.1 0.03 ± 70% perf-
> > profile.self.cycles-pp.perf_adjust_freq_unthr_context
> > 0.25 ± 6% -0.0 0.21 ± 12% perf-
> > profile.self.cycles-pp.irqtime_account_irq
> > 0.02 ±141% +0.0 0.06 ± 13% perf-
> > profile.self.cycles-pp.prepend_path
> > 0.13 ± 10% +0.1 0.24 ± 11% perf-
> > profile.self.cycles-pp.timerqueue_del
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU
> > @ 2.00GHz (Ice Lake) with 256G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/ro
> > otfs/tbox_group/testcase:
> > gcc-12/performance/pipe/4/x86_64-rhel-9.4/process/50%/debian-12-
> > x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 3.924e+08 ± 3% +55.1% 6.086e+08 ± 2% cpuidle..time
> > 7504886 ± 11% +184.4% 21340245 ± 6% cpuidle..usage
> > 13350305 -3.8% 12848570 vmstat.system.cs
> > 1849619 +5.1% 1943754 vmstat.system.in
> > 3.56 ± 5% +2.6 6.16 ± 7% mpstat.cpu.all.idle%
> > 0.69 +0.2 0.90 ± 3% mpstat.cpu.all.irq%
> > 0.03 ± 3% +0.0 0.04 ± 3% mpstat.cpu.all.soft%
> > 18666 ± 9% +41.2% 26352 ± 6% perf-c2c.DRAM.remote
> > 197041 -39.6% 118945 ± 5% perf-c2c.HITM.local
> > 3178 ± 12% +37.2% 4361 ± 11% perf-c2c.HITM.remote
> > 200219 -38.4% 123307 ± 5% perf-c2c.HITM.total
> > 2842579 ± 11% +60.1% 4550025 ± 12% meminfo.Active
> > 2842579 ± 11% +60.1% 4550025 ± 12% meminfo.Active(anon)
> > 5535242 ± 5% +30.9% 7248257 ± 7% meminfo.Cached
> > 3846718 ± 8% +44.0% 5539484 ± 9% meminfo.Committed_AS
> > 9684149 ± 3% +20.5% 11666616 ± 4% meminfo.Memused
> > 136127 ± 3% +14.2% 155524 meminfo.PageTables
> > 62144 +22.8% 76336 meminfo.Percpu
> > 2001586 ± 16% +85.6% 3714611 ± 14% meminfo.Shmem
> > 9759598 ± 3% +20.0% 11714619 ± 4% meminfo.max_used_kB
> > 710625 ± 11% +59.3% 1131770 ± 11% proc-
> > vmstat.nr_active_anon
> > 1383631 ± 5% +30.6% 1806419 ± 7% proc-
> > vmstat.nr_file_pages
> > 34220 ± 3% +13.9% 38987 proc-
> > vmstat.nr_page_table_pages
> > 500216 ± 16% +84.5% 923007 ± 14% proc-vmstat.nr_shmem
> > 710625 ± 11% +59.3% 1131770 ± 11% proc-
> > vmstat.nr_zone_active_anon
> > 92308030 +8.7% 1.004e+08 proc-vmstat.numa_hit
> > 92171407 +8.7% 1.002e+08 proc-
> > vmstat.numa_local
> > 133616 +2.7% 137265 proc-
> > vmstat.numa_other
> > 92394313 +8.7% 1.004e+08 proc-
> > vmstat.pgalloc_normal
> > 91035691 +7.8% 98094626 proc-vmstat.pgfree
> > 867815 +11.8% 970369 hackbench.throughput
> > 830278 +11.6% 926834
> > hackbench.throughput_avg
> > 867815 +11.8% 970369
> > hackbench.throughput_best
> > 760822 +14.2% 869145
> > hackbench.throughput_worst
> > 72.87 -10.3% 65.36
> > hackbench.time.elapsed_time
> > 72.87 -10.3% 65.36
> > hackbench.time.elapsed_time.max
> > 2.493e+08 -17.7% 2.052e+08
> > hackbench.time.involuntary_context_switches
> > 12357 -3.9% 11879
> > hackbench.time.percent_of_cpu_this_job_got
> > 8029 -14.8% 6842
> > hackbench.time.system_time
> > 976.58 -5.5% 923.21
> > hackbench.time.user_time
> > 7.54e+08 -14.4% 6.451e+08
> > hackbench.time.voluntary_context_switches
> > 5.598e+10 +6.6% 5.965e+10 perf-stat.i.branch-
> > instructions
> > 0.40 -0.0 0.38 perf-stat.i.branch-
> > miss-rate%
> > 8.36 ± 2% +4.6 12.98 ± 3% perf-stat.i.cache-
> > miss-rate%
> > 2.11e+09 -33.8% 1.396e+09 perf-stat.i.cache-
> > references
> > 13687653 -3.4% 13225338 perf-stat.i.context-
> > switches
> > 1.36 -7.9% 1.25 perf-stat.i.cpi
> > 3.219e+11 -2.2% 3.147e+11 perf-stat.i.cpu-
> > cycles
> > 1915 ± 2% -6.6% 1788 ± 3% perf-stat.i.cycles-
> > between-cache-misses
> > 2.371e+11 +6.0% 2.512e+11 perf-
> > stat.i.instructions
> > 0.74 +8.5% 0.80 perf-stat.i.ipc
> > 1.15 ± 14% -28.3% 0.82 ± 23% perf-stat.i.major-
> > faults
> > 115.09 -3.2% 111.40 perf-
> > stat.i.metric.K/sec
> > 0.37 -0.0 0.35 perf-
> > stat.overall.branch-miss-rate%
> > 8.15 ± 3% +4.6 12.74 ± 3% perf-
> > stat.overall.cache-miss-rate%
> > 1.36 -7.7% 1.25 perf-
> > stat.overall.cpi
> > 1875 ± 2% -5.5% 1772 ± 4% perf-
> > stat.overall.cycles-between-cache-misses
> > 0.74 +8.3% 0.80 perf-
> > stat.overall.ipc
> > 5.524e+10 +6.4% 5.877e+10 perf-stat.ps.branch-
> > instructions
> > 2.079e+09 -33.9% 1.375e+09 perf-stat.ps.cache-
> > references
> > 13486088 -3.4% 13020988 perf-
> > stat.ps.context-switches
> > 3.175e+11 -2.3% 3.101e+11 perf-stat.ps.cpu-
> > cycles
> > 2.34e+11 +5.8% 2.475e+11 perf-
> > stat.ps.instructions
> > 1.09 ± 14% -28.3% 0.78 ± 21% perf-stat.ps.major-
> > faults
> > 1.73e+13 -5.1% 1.642e+13 perf-
> > stat.total.instructions
> > 3527725 +10.7% 3905361
> > sched_debug.cfs_rq:/.avg_vruntime.avg
> > 3975260 +14.1% 4535959 ± 6%
> > sched_debug.cfs_rq:/.avg_vruntime.max
> > 98657 ± 17% +84.9% 182407 ± 18%
> > sched_debug.cfs_rq:/.avg_vruntime.stddev
> > 11.83 ± 7% +17.6% 13.92 ± 5%
> > sched_debug.cfs_rq:/.h_nr_queued.max
> > 2.71 ± 5% +21.8% 3.30 ± 4%
> > sched_debug.cfs_rq:/.h_nr_queued.stddev
> > 11.75 ± 7% +17.7% 13.83 ± 6%
> > sched_debug.cfs_rq:/.h_nr_runnable.max
> > 2.68 ± 4% +21.2% 3.25 ± 5%
> > sched_debug.cfs_rq:/.h_nr_runnable.stddev
> > 4556 ±223% +691.0% 36039 ± 34%
> > sched_debug.cfs_rq:/.left_deadline.avg
> > 583131 ±223% +577.3% 3949548 ± 4%
> > sched_debug.cfs_rq:/.left_deadline.max
> > 51341 ±223% +622.0% 370695 ± 16%
> > sched_debug.cfs_rq:/.left_deadline.stddev
> > 4555 ±223% +691.0% 36035 ± 34%
> > sched_debug.cfs_rq:/.left_vruntime.avg
> > 583105 ±223% +577.3% 3949123 ± 4%
> > sched_debug.cfs_rq:/.left_vruntime.max
> > 51338 ±223% +622.0% 370651 ± 16%
> > sched_debug.cfs_rq:/.left_vruntime.stddev
> > 3527725 +10.7% 3905361
> > sched_debug.cfs_rq:/.min_vruntime.avg
> > 3975260 +14.1% 4535959 ± 6%
> > sched_debug.cfs_rq:/.min_vruntime.max
> > 98657 ± 17% +84.9% 182407 ± 18%
> > sched_debug.cfs_rq:/.min_vruntime.stddev
> > 0.22 ± 5% +13.9% 0.25 ± 5%
> > sched_debug.cfs_rq:/.nr_queued.stddev
> > 4555 ±223% +691.0% 36035 ± 34%
> > sched_debug.cfs_rq:/.right_vruntime.avg
> > 583105 ±223% +577.3% 3949123 ± 4%
> > sched_debug.cfs_rq:/.right_vruntime.max
> > 51338 ±223% +622.0% 370651 ± 16%
> > sched_debug.cfs_rq:/.right_vruntime.stddev
> > 1336 ± 7% +50.8% 2014 ± 6%
> > sched_debug.cfs_rq:/.runnable_avg.stddev
> > 552.53 ± 8% +19.6% 660.87 ± 5%
> > sched_debug.cfs_rq:/.util_est.avg
> > 384.27 ± 9% +28.9% 495.43 ± 11%
> > sched_debug.cfs_rq:/.util_est.stddev
> > 1328 ± 17% +42.7% 1896 ± 13%
> > sched_debug.cpu.curr->pid.stddev
> > 11.75 ± 8% +19.1% 14.00 ± 6%
> > sched_debug.cpu.nr_running.max
> > 2.71 ± 5% +22.7% 3.33 ± 4%
> > sched_debug.cpu.nr_running.stddev
> > 76578 ± 9% +33.7% 102390 ± 5%
> > sched_debug.cpu.nr_switches.stddev
> > 62.25 ± 7% +17.9% 73.42 ± 7%
> > sched_debug.cpu.nr_uninterruptible.max
> > 8.11 ± 58% -82.0% 1.46 ± 47% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.anon_pipe_write
> > 12.04 ±104% -86.8% 1.58 ± 55% perf-
> > sched.sch_delay.avg.ms.__cond_resched.__mutex_lock.constprop.0.anon
> > _pipe_write
> > 0.11 ±123% -95.3% 0.01 ±102% perf-
> > sched.sch_delay.avg.ms.__cond_resched.down_write_killable.map_vdso.
> > load_elf_binary.exec_binprm
> > 0.06 ±103% -93.6% 0.00 ±154% perf-
> > sched.sch_delay.avg.ms.__cond_resched.filemap_read.__kernel_read.ex
> > ec_binprm.bprm_execve
> > 0.10 ±109% -93.9% 0.01 ±163% perf-
> > sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.vma_link
> > 1.00 ± 21% -59.6% 0.40 ± 50% perf-
> > sched.sch_delay.avg.ms.__cond_resched.mutex_lock.anon_pipe_read.vfs
> > _read.ksys_read
> > 14.54 ± 14% -79.2% 3.02 ± 51% perf-
> > sched.sch_delay.avg.ms.__cond_resched.mutex_lock.anon_pipe_write.vf
> > s_write.ksys_write
> > 1.50 ± 84% -74.1% 0.39 ± 90% perf-
> > sched.sch_delay.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem
> > _alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> > 1.13 ± 68% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.38 ± 97% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 1.10 ± 17% -68.9% 0.34 ± 49% perf-
> > sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall
> > _64
> > 42.25 ± 18% -71.7% 11.96 ± 53% perf-
> > sched.sch_delay.avg.ms.anon_pipe_write.vfs_write.ksys_write.do_sysc
> > all_64
> > 3.25 ± 17% -77.5% 0.73 ± 49% perf-
> > sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 29.17 ± 33% -62.0% 11.09 ± 85% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown]
> > 46.25 ± 15% -68.8% 14.43 ± 52% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 3.72 ± 70% -81.0% 0.70 ± 67% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown]
> > 7.95 ± 55% -69.7% 2.41 ± 65% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown].[unknown]
> > 3.66 ±139% -97.1% 0.11 ± 58% perf-
> > sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule
> > _timeout.constprop.0.do_poll
> > 3.05 ± 44% -91.9% 0.25 ± 57% perf-
> > sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_read
> > 29.96 ± 9% -83.6% 4.90 ± 48% perf-
> > sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_write
> > 26.20 ± 59% -88.9% 2.92 ± 66% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 0.14 ± 84% -91.2% 0.01 ±142% perf-
> > sched.sch_delay.max.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.__pmd_alloc
> > 0.20 ±149% -97.5% 0.01 ±102% perf-
> > sched.sch_delay.max.ms.__cond_resched.down_write_killable.map_vdso.
> > load_elf_binary.exec_binprm
> > 0.11 ±144% -96.6% 0.00 ±154% perf-
> > sched.sch_delay.max.ms.__cond_resched.filemap_read.__kernel_read.ex
> > ec_binprm.bprm_execve
> > 0.19 ±118% -96.7% 0.01 ±163% perf-
> > sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.vma_link
> > 274.64 ± 95% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 3.72 ±151% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 3135 ± 5% -48.6% 1611 ± 57% perf-
> > sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 1320 ± 19% -78.6% 282.01 ± 74% perf-
> > sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown]
> > 265.55 ± 82% -77.9% 58.70 ±124% perf-
> > sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_read
> > 1850 ± 28% -59.1% 757.74 ± 68% perf-
> > sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_write
> > 766.85 ± 56% -68.0% 245.51 ± 51% perf-
> > sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 1.77 ± 17% -71.9% 0.50 ± 49% perf-
> > sched.total_sch_delay.average.ms
> > 5.15 ± 17% -69.5% 1.57 ± 48% perf-
> > sched.total_wait_and_delay.average.ms
> > 3.38 ± 17% -68.2% 1.07 ± 48% perf-
> > sched.total_wait_time.average.ms
> > 5100 ± 3% -31.0% 3522 ± 47% perf-
> > sched.total_wait_time.max.ms
> > 27.42 ± 49% -85.2% 4.07 ± 47% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__alloc_frozen_pages_nop
> > rof.alloc_pages_mpol.alloc_pages_noprof.anon_pipe_write
> > 35.29 ± 80% -85.8% 5.00 ± 51% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__mutex_lock.constprop.0
> > .anon_pipe_write
> > 42.28 ± 14% -79.4% 8.70 ± 51% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.anon_pipe_wri
> > te.vfs_write.ksys_write
> > 3.12 ± 17% -66.4% 1.05 ± 48% perf-
> > sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_sy
> > scall_64
> > 122.62 ± 18% -70.4% 36.26 ± 53% perf-
> > sched.wait_and_delay.avg.ms.anon_pipe_write.vfs_write.ksys_write.do
> > _syscall_64
> > 250.26 ± 65% -94.2% 14.56 ± 55% perf-
> > sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entr
> > y_SYSCALL_64_after_hwframe
> > 9.37 ± 17% -78.2% 2.05 ± 48% perf-
> > sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe.[unknown]
> > 58.34 ± 33% -62.0% 22.18 ± 85% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_a
> > pic_timer_interrupt.[unknown]
> > 134.44 ± 15% -69.3% 41.24 ± 52% perf-
> > sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_a
> > pic_timer_interrupt.[unknown].[unknown]
> > 86.94 ± 6% -83.1% 14.68 ± 48% perf-
> > sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.
> > constprop.0.anon_pipe_write
> > 86.57 ± 39% -86.0% 12.14 ± 59% perf-
> > sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp
> > _kthread.kthread
> > 647.92 ± 48% -97.9% 13.86 ± 45% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 6386 ± 6% -46.8% 3397 ± 57% perf-
> > sched.wait_and_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe.[unknown]
> > 3868 ± 27% -60.4% 1531 ± 67% perf-
> > sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.
> > constprop.0.anon_pipe_write
> > 1647 ± 55% -67.7% 531.51 ± 50% perf-
> > sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp
> > _kthread.kthread
> > 5014 ± 5% -32.5% 3385 ± 47% perf-
> > sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork
> > .ret_from_fork_asm
> > 19.31 ± 47% -86.5% 2.61 ± 49% perf-
> > sched.wait_time.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.anon_pipe_write
> > 23.25 ± 70% -85.3% 3.42 ± 52% perf-
> > sched.wait_time.avg.ms.__cond_resched.__mutex_lock.constprop.0.anon
> > _pipe_write
> > 18.33 ± 15% -42.0% 10.64 ± 49% perf-
> > sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move
> > _task.__set_cpus_allowed_ptr.__sched_setaffinity
> > 0.11 ±123% -95.3% 0.01 ±102% perf-
> > sched.wait_time.avg.ms.__cond_resched.down_write_killable.map_vdso.
> > load_elf_binary.exec_binprm
> > 0.06 ±103% -93.6% 0.00 ±154% perf-
> > sched.wait_time.avg.ms.__cond_resched.filemap_read.__kernel_read.ex
> > ec_binprm.bprm_execve
> > 0.10 ±109% -93.9% 0.01 ±163% perf-
> > sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.vma_link
> > 1.70 ± 21% -52.6% 0.81 ± 48% perf-
> > sched.wait_time.avg.ms.__cond_resched.mutex_lock.anon_pipe_read.vfs
> > _read.ksys_read
> > 27.74 ± 15% -79.5% 5.68 ± 51% perf-
> > sched.wait_time.avg.ms.__cond_resched.mutex_lock.anon_pipe_write.vf
> > s_write.ksys_write
> > 2.17 ± 75% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.42 ± 97% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 2.02 ± 17% -65.1% 0.70 ± 48% perf-
> > sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall
> > _64
> > 80.37 ± 18% -69.8% 24.31 ± 52% perf-
> > sched.wait_time.avg.ms.anon_pipe_write.vfs_write.ksys_write.do_sysc
> > all_64
> > 210.13 ± 68% -95.1% 10.21 ± 55% perf-
> > sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYS
> > CALL_64_after_hwframe
> > 6.12 ± 17% -78.5% 1.32 ± 48% perf-
> > sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 29.17 ± 33% -62.0% 11.09 ± 85% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown]
> > 88.19 ± 16% -69.6% 26.81 ± 52% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 13.77 ± 45% -65.7% 4.72 ± 53% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown].[unknown]
> > 104.64 ± 42% -76.4% 24.74 ±135% perf-
> > sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_fro
> > m_fork_asm
> > 5.16 ± 29% -92.5% 0.39 ± 48% perf-
> > sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_read
> > 56.98 ± 5% -82.9% 9.77 ± 48% perf-
> > sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_write
> > 60.36 ± 32% -84.7% 9.22 ± 57% perf-
> > sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 619.88 ± 43% -98.0% 12.52 ± 45% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.14 ± 84% -91.2% 0.01 ±142% perf-
> > sched.wait_time.max.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.__pmd_alloc
> > 740.14 ± 35% -68.5% 233.31 ± 83% perf-
> > sched.wait_time.max.ms.__cond_resched.__alloc_frozen_pages_noprof.a
> > lloc_pages_mpol.alloc_pages_noprof.anon_pipe_write
> > 0.20 ±149% -97.5% 0.01 ±102% perf-
> > sched.wait_time.max.ms.__cond_resched.down_write_killable.map_vdso.
> > load_elf_binary.exec_binprm
> > 0.11 ±144% -96.6% 0.00 ±154% perf-
> > sched.wait_time.max.ms.__cond_resched.filemap_read.__kernel_read.ex
> > ec_binprm.bprm_execve
> > 0.19 ±118% -96.7% 0.01 ±163% perf-
> > sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_a
> > lloc_nodes.mas_preallocate.vma_link
> > 327.64 ± 71% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 3.72 ±151% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.irqentry_exit_t
> > o_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> > 3299 ± 6% -40.7% 1957 ± 51% perf-
> > sched.wait_time.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 436.75 ± 39% -76.9% 100.85 ± 98% perf-
> > sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_read
> > 2112 ± 19% -62.3% 796.34 ± 63% perf-
> > sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.const
> > prop.0.anon_pipe_write
> > 947.83 ± 46% -58.8% 390.83 ± 53% perf-
> > sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 5014 ± 5% -32.5% 3385 ± 47% perf-
> > sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_
> > from_fork_asm
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-ivb-2ep2: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2
> > @ 2.70GHz (Ivy Bridge-EP) with 64G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/t
> > esttime:
> > gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-
> > 20240206.cgz/lkp-ivb-2ep2/shell_rtns_2/aim9/300s
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 11036 +85.7% 20499 meminfo.PageTables
> > 125.17 ± 8% +18.4% 148.17 ± 7% perf-c2c.HITM.local
> > 30464 +18.7% 36160
> > sched_debug.cpu.nr_switches.avg
> > 9166 +19.8% 10985 vmstat.system.cs
> > 6623 ± 17% +60.8% 10652 ± 5% numa-
> > meminfo.node0.PageTables
> > 4414 ± 26% +123.2% 9853 ± 6% numa-
> > meminfo.node1.PageTables
> > 1653 ± 17% +60.1% 2647 ± 5% numa-
> > vmstat.node0.nr_page_table_pages
> > 1097 ± 26% +123.9% 2457 ± 6% numa-
> > vmstat.node1.nr_page_table_pages
> > 319.08 -2.2% 312.04
> > aim9.shell_rtns_2.ops_per_sec
> > 27170926 -2.2% 26586121
> > aim9.time.minor_page_faults
> > 1051038 -2.2% 1027732
> > aim9.time.voluntary_context_switches
> > 2736 +86.4% 5101 proc-
> > vmstat.nr_page_table_pages
> > 28014 +1.3% 28378 proc-
> > vmstat.nr_slab_unreclaimable
> > 19332129 -1.5% 19048363 proc-vmstat.numa_hit
> > 19283853 -1.5% 18996609 proc-
> > vmstat.numa_local
> > 19892794 -1.5% 19598065 proc-
> > vmstat.pgalloc_normal
> > 28044189 -2.1% 27457289 proc-vmstat.pgfault
> > 19843766 -1.5% 19543091 proc-vmstat.pgfree
> > 419715 -5.7% 395688 ± 8% proc-vmstat.pgreuse
> > 2682 -2.0% 2628 proc-
> > vmstat.unevictable_pgs_culled
> > 0.07 ± 6% -30.5% 0.05 ± 22% perf-
> > sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthr
> > ead.kthread
> > 0.03 ± 6% +36.0% 0.04 perf-
> > sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.07 ± 33% -57.5% 0.03 ± 53% perf-
> > sched.sch_delay.max.ms.__cond_resched.__wait_for_common.wait_for_co
> > mpletion_state.kernel_clone.__x64_sys_vfork
> > 0.02 ± 74% +112.0% 0.05 ± 36% perf-
> > sched.sch_delay.max.ms.__cond_resched.down_read.walk_component.link
> > _path_walk.path_openat
> > 0.02 +24.1% 0.02 ± 2% perf-
> > sched.total_sch_delay.average.ms
> > 27.52 -14.0% 23.67 perf-
> > sched.total_wait_and_delay.average.ms
> > 23179 +18.3% 27421 perf-
> > sched.total_wait_and_delay.count.ms
> > 27.50 -14.0% 23.65 perf-
> > sched.total_wait_time.average.ms
> > 117.03 ± 3% -72.4% 32.27 ± 2% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 1655 ± 2% +282.0% 6324 perf-
> > sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 0.96 ± 29% +51.6% 1.45 ± 22% perf-
> > sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.i
> > sra.0
> > 117.00 ± 3% -72.5% 32.23 ± 2% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 5.93 +0.1 6.00 perf-stat.i.branch-
> > miss-rate%
> > 9189 +19.8% 11011 perf-stat.i.context-
> > switches
> > 1.96 +1.6% 1.99 perf-stat.i.cpi
> > 71.21 +60.6% 114.39 ± 4% perf-stat.i.cpu-
> > migrations
> > 0.53 -1.5% 0.52 perf-stat.i.ipc
> > 3.79 -2.1% 3.71 perf-
> > stat.i.metric.K/sec
> > 90998 -2.1% 89084 perf-stat.i.minor-
> > faults
> > 90998 -2.1% 89084 perf-stat.i.page-
> > faults
> > 5.99 +0.1 6.06 perf-
> > stat.overall.branch-miss-rate%
> > 1.79 +1.4% 1.82 perf-
> > stat.overall.cpi
> > 0.56 -1.3% 0.55 perf-
> > stat.overall.ipc
> > 9158 +19.8% 10974 perf-
> > stat.ps.context-switches
> > 70.99 +60.6% 114.02 ± 4% perf-stat.ps.cpu-
> > migrations
> > 90694 -2.1% 88787 perf-stat.ps.minor-
> > faults
> > 90695 -2.1% 88787 perf-stat.ps.page-
> > faults
> > 8.155e+11 -1.1% 8.065e+11 perf-
> > stat.total.instructions
> > 8.87 -0.3 8.55 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 8.86 -0.3 8.54 perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 2.53 ± 2% -0.1 2.43 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
> > 2.54 -0.1 2.44 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> > 2.49 -0.1 2.40 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
> > 0.98 ± 5% -0.1 0.90 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYS
> > CALL_64_after_hwframe
> > 0.70 ± 3% -0.1 0.62 ± 6% perf-
> > profile.calltrace.cycles-
> > pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
> > 0.18 ±141% +0.5 0.67 ± 6% perf-
> > profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > 0.18 ±141% +0.5 0.67 ± 6% perf-
> > profile.calltrace.cycles-pp.ret_from_fork_asm
> > 0.00 +0.6 0.59 ± 7% perf-
> > profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> > 62.48 +0.7 63.14 perf-
> > profile.calltrace.cycles-
> > pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_
> > startup_entry
> > 49.10 +0.7 49.78 perf-
> > profile.calltrace.cycles-
> > pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.d
> > o_idle
> > 67.62 +0.8 68.43 perf-
> > profile.calltrace.cycles-pp.common_startup_64
> > 20.14 -0.7 19.40 perf-
> > profile.children.cycles-pp.do_syscall_64
> > 20.18 -0.7 19.44 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 3.33 ± 2% -0.2 3.16 ± 2% perf-
> > profile.children.cycles-pp.vm_mmap_pgoff
> > 3.22 ± 2% -0.2 3.06 perf-
> > profile.children.cycles-pp.do_mmap
> > 3.51 ± 2% -0.1 3.38 perf-
> > profile.children.cycles-pp.do_exit
> > 3.52 ± 2% -0.1 3.38 perf-
> > profile.children.cycles-pp.__x64_sys_exit_group
> > 3.52 ± 2% -0.1 3.38 perf-
> > profile.children.cycles-pp.do_group_exit
> > 3.67 -0.1 3.54 perf-
> > profile.children.cycles-pp.x64_sys_call
> > 2.21 -0.1 2.09 ± 3% perf-
> > profile.children.cycles-pp.__x64_sys_openat
> > 2.07 ± 2% -0.1 1.94 ± 2% perf-
> > profile.children.cycles-pp.path_openat
> > 2.09 ± 2% -0.1 1.97 ± 2% perf-
> > profile.children.cycles-pp.do_filp_open
> > 2.19 -0.1 2.08 ± 3% perf-
> > profile.children.cycles-pp.do_sys_openat2
> > 1.50 ± 4% -0.1 1.39 ± 3% perf-
> > profile.children.cycles-pp.copy_process
> > 2.56 -0.1 2.46 ± 2% perf-
> > profile.children.cycles-pp.exit_mm
> > 2.55 -0.1 2.44 ± 2% perf-
> > profile.children.cycles-pp.__mmput
> > 2.51 ± 2% -0.1 2.41 ± 2% perf-
> > profile.children.cycles-pp.exit_mmap
> > 0.70 ± 3% -0.1 0.62 ± 6% perf-
> > profile.children.cycles-pp.dup_mm
> > 0.94 ± 4% -0.1 0.89 ± 2% perf-
> > profile.children.cycles-pp.__alloc_frozen_pages_noprof
> > 0.57 ± 3% -0.0 0.52 ± 4% perf-
> > profile.children.cycles-pp.alloc_pages_noprof
> > 0.20 ± 12% -0.0 0.15 ± 10% perf-
> > profile.children.cycles-pp.perf_event_task_tick
> > 0.18 ± 4% -0.0 0.14 ± 15% perf-
> > profile.children.cycles-pp.xas_find
> > 0.10 ± 12% -0.0 0.07 ± 24% perf-
> > profile.children.cycles-pp.up_write
> > 0.09 ± 6% -0.0 0.07 ± 11% perf-
> > profile.children.cycles-pp.tick_check_broadcast_expired
> > 0.08 ± 12% +0.0 0.10 ± 8% perf-
> > profile.children.cycles-pp.hrtimer_try_to_cancel
> > 0.10 ± 13% +0.0 0.13 ± 5% perf-
> > profile.children.cycles-pp.__perf_event_task_sched_out
> > 0.20 ± 8% +0.0 0.23 ± 4% perf-
> > profile.children.cycles-pp.enqueue_entity
> > 0.21 ± 9% +0.0 0.25 ± 4% perf-
> > profile.children.cycles-pp.prepare_task_switch
> > 0.03 ±101% +0.0 0.07 ± 16% perf-
> > profile.children.cycles-pp.run_ksoftirqd
> > 0.04 ± 71% +0.1 0.09 ± 15% perf-
> > profile.children.cycles-pp.kick_pool
> > 0.05 ± 47% +0.1 0.11 ± 16% perf-
> > profile.children.cycles-pp.__queue_work
> > 0.28 ± 5% +0.1 0.34 ± 7% perf-
> > profile.children.cycles-pp.exit_to_user_mode_loop
> > 0.50 +0.1 0.56 ± 2% perf-
> > profile.children.cycles-pp.timerqueue_del
> > 0.04 ± 71% +0.1 0.11 ± 17% perf-
> > profile.children.cycles-pp.queue_work_on
> > 0.51 ± 4% +0.1 0.58 ± 2% perf-
> > profile.children.cycles-pp.enqueue_task_fair
> > 0.32 ± 3% +0.1 0.40 ± 4% perf-
> > profile.children.cycles-pp.ttwu_do_activate
> > 0.53 ± 5% +0.1 0.61 ± 3% perf-
> > profile.children.cycles-pp.enqueue_task
> > 0.49 ± 4% +0.1 0.57 ± 6% perf-
> > profile.children.cycles-pp.schedule
> > 0.28 ± 6% +0.1 0.38 perf-
> > profile.children.cycles-pp.sched_ttwu_pending
> > 0.32 ± 5% +0.1 0.43 ± 2% perf-
> > profile.children.cycles-pp.__flush_smp_call_function_queue
> > 0.35 ± 8% +0.1 0.47 ± 2% perf-
> > profile.children.cycles-pp.flush_smp_call_function_queue
> > 0.17 ± 10% +0.2 0.34 ± 12% perf-
> > profile.children.cycles-pp.worker_thread
> > 0.88 ± 3% +0.2 1.06 ± 4% perf-
> > profile.children.cycles-pp.ret_from_fork
> > 0.88 ± 3% +0.2 1.06 ± 4% perf-
> > profile.children.cycles-pp.ret_from_fork_asm
> > 0.39 ± 6% +0.2 0.59 ± 7% perf-
> > profile.children.cycles-pp.kthread
> > 66.24 +0.6 66.85 perf-
> > profile.children.cycles-pp.cpuidle_idle_call
> > 63.09 +0.6 63.73 perf-
> > profile.children.cycles-pp.cpuidle_enter
> > 62.97 +0.6 63.61 perf-
> > profile.children.cycles-pp.cpuidle_enter_state
> > 67.61 +0.8 68.43 perf-
> > profile.children.cycles-pp.do_idle
> > 67.62 +0.8 68.43 perf-
> > profile.children.cycles-pp.common_startup_64
> > 67.62 +0.8 68.43 perf-
> > profile.children.cycles-pp.cpu_startup_entry
> > 0.37 ± 11% -0.1 0.31 ± 3% perf-
> > profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> > 0.10 ± 13% -0.0 0.06 ± 50% perf-
> > profile.self.cycles-pp.up_write
> > 0.15 ± 4% +0.1 0.22 ± 8% perf-
> > profile.self.cycles-pp.timerqueue_del
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-ivb-2ep2: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2
> > @ 2.70GHz (Ivy Bridge-EP) with 64G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/t
> > esttime:
> > gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-
> > 20240206.cgz/lkp-ivb-2ep2/exec_test/aim9/300s
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 12120 +76.7% 21422 meminfo.PageTables
> > 8543 +26.9% 10840 vmstat.system.cs
> > 6148 ± 11% +89.9% 11678 ± 5% numa-
> > meminfo.node0.PageTables
> > 5909 ± 11% +64.0% 9689 ± 7% numa-
> > meminfo.node1.PageTables
> > 1532 ± 10% +90.5% 2919 ± 5% numa-
> > vmstat.node0.nr_page_table_pages
> > 1468 ± 11% +65.2% 2426 ± 7% numa-
> > vmstat.node1.nr_page_table_pages
> > 2991 +78.0% 5323 proc-
> > vmstat.nr_page_table_pages
> > 32726750 -2.4% 31952115 proc-vmstat.pgfault
> > 1228 -2.6% 1197
> > aim9.exec_test.ops_per_sec
> > 11018 ± 2% +10.5% 12178 ± 2%
> > aim9.time.involuntary_context_switches
> > 31835059 -2.4% 31062527
> > aim9.time.minor_page_faults
> > 736468 -2.9% 715310
> > aim9.time.voluntary_context_switches
> > 0.28 ± 7% +11.3% 0.31 ± 6%
> > sched_debug.cfs_rq:/.h_nr_queued.stddev
> > 0.28 ± 7% +11.3% 0.31 ± 6%
> > sched_debug.cfs_rq:/.nr_queued.stddev
> > 356683 ± 16% +27.0% 453000 ± 9%
> > sched_debug.cpu.avg_idle.min
> > 27620 ± 7% +29.5% 35775
> > sched_debug.cpu.nr_switches.avg
> > 84830 ± 14% +16.3% 98648 ± 4%
> > sched_debug.cpu.nr_switches.max
> > 4563 ± 26% +46.2% 6671 ± 26%
> > sched_debug.cpu.nr_switches.min
> > 0.03 ± 4% -67.3% 0.01 ±141% perf-
> > sched.sch_delay.avg.ms.__cond_resched.mutex_lock.futex_exec_release
> > .exec_mm_release.exec_mmap
> > 0.03 +11.2% 0.03 ± 2% perf-
> > sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.05 ± 28% +61.3% 0.09 ± 21% perf-
> > sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_t
> > imer_interrupt.[unknown].[unknown]
> > 0.10 ± 18% +18.8% 0.12 perf-
> > sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 0.02 ± 3% +18.3% 0.02 ± 2% perf-
> > sched.total_sch_delay.average.ms
> > 28.80 -19.8% 23.10 ± 3% perf-
> > sched.total_wait_and_delay.average.ms
> > 22332 +24.4% 27778 perf-
> > sched.total_wait_and_delay.count.ms
> > 28.78 -19.8% 23.07 ± 3% perf-
> > sched.total_wait_time.average.ms
> > 17.39 ± 10% -15.6% 14.67 ± 4% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine
> > _move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> > 41.02 ± 4% -54.6% 18.64 ± 6% perf-
> > sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret
> > _from_fork_asm
> > 4795 ± 2% +122.5% 10668 perf-
> > sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_
> > from_fork_asm
> > 17.35 ± 10% -15.7% 14.63 ± 4% perf-
> > sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move
> > _task.__set_cpus_allowed_ptr.__sched_setaffinity
> > 0.00 ±141% +400.0% 0.00 ± 44% perf-
> > sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vf
> > s_open
> > 40.99 ± 4% -54.6% 18.61 ± 6% perf-
> > sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from
> > _fork_asm
> > 0.00 ±149% +542.9% 0.03 ± 41% perf-
> > sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vf
> > s_open
> > 5.617e+08 -1.6% 5.529e+08 perf-stat.i.branch-
> > instructions
> > 5.76 +0.1 5.84 perf-stat.i.branch-
> > miss-rate%
> > 8562 +27.0% 10878 perf-stat.i.context-
> > switches
> > 1.87 +2.6% 1.92 perf-stat.i.cpi
> > 78.02 ± 3% +11.8% 87.23 ± 2% perf-stat.i.cpu-
> > migrations
> > 2.792e+09 -1.6% 2.748e+09 perf-
> > stat.i.instructions
> > 0.55 -2.5% 0.54 perf-stat.i.ipc
> > 4.42 -2.4% 4.31 perf-
> > stat.i.metric.K/sec
> > 106019 -2.4% 103509 perf-stat.i.minor-
> > faults
> > 106019 -2.4% 103509 perf-stat.i.page-
> > faults
> > 5.83 +0.1 5.91 perf-
> > stat.overall.branch-miss-rate%
> > 1.72 +2.3% 1.76 perf-
> > stat.overall.cpi
> > 0.58 -2.3% 0.57 perf-
> > stat.overall.ipc
> > 5.599e+08 -1.6% 5.511e+08 perf-stat.ps.branch-
> > instructions
> > 8534 +27.0% 10841 perf-
> > stat.ps.context-switches
> > 77.77 ± 3% +11.8% 86.96 ± 2% perf-stat.ps.cpu-
> > migrations
> > 2.783e+09 -1.6% 2.739e+09 perf-
> > stat.ps.instructions
> > 105666 -2.4% 103164 perf-stat.ps.minor-
> > faults
> > 105666 -2.4% 103164 perf-stat.ps.page-
> > faults
> > 8.386e+11 -1.6% 8.253e+11 perf-
> > stat.total.instructions
> > 7.79 -0.4 7.41 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_
> > 64_after_hwframe.execve
> > 7.75 -0.3 7.47 perf-
> > profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 7.73 -0.3 7.46 perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 2.68 ± 2% -0.2 2.52 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_sysca
> > ll_64
> > 2.68 ± 2% -0.2 2.52 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64
> > _after_hwframe
> > 2.68 ± 2% -0.2 2.52 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe
> > 2.73 ± 2% -0.2 2.57 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 2.60 -0.1 2.46 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.ex
> > ecve.exec_test
> > 2.61 -0.1 2.47 ± 3% perf-
> > profile.calltrace.cycles-pp.execve.exec_test
> > 2.60 -0.1 2.46 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve.exec_test
> > 2.60 -0.1 2.46 ± 3% perf-
> > profile.calltrace.cycles-
> > pp.entry_SYSCALL_64_after_hwframe.execve.exec_test
> > 1.92 ± 3% -0.1 1.79 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
> > 1.92 ± 3% -0.1 1.80 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> > 4.68 -0.1 4.57 perf-
> > profile.calltrace.cycles-pp._Fork
> > 1.88 ± 2% -0.1 1.77 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
> > 2.76 -0.1 2.66 ± 2% perf-
> > profile.calltrace.cycles-pp.exec_test
> > 3.24 -0.1 3.16 perf-
> > profile.calltrace.cycles-
> > pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYS
> > CALL_64_after_hwframe
> > 0.84 ± 4% -0.1 0.77 ± 5% perf-
> > profile.calltrace.cycles-pp.wait4
> > 0.88 ± 7% +0.2 1.09 ± 3% perf-
> > profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> > 0.88 ± 7% +0.2 1.09 ± 3% perf-
> > profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > 0.88 ± 7% +0.2 1.09 ± 3% perf-
> > profile.calltrace.cycles-pp.ret_from_fork_asm
> > 0.46 ± 45% +0.3 0.78 ± 5% perf-
> > profile.calltrace.cycles-
> > pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> > 0.17 ±141% +0.4 0.53 ± 4% perf-
> > profile.calltrace.cycles-
> > pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_seconda
> > ry
> > 0.18 ±141% +0.4 0.54 ± 2% perf-
> > profile.calltrace.cycles-
> > pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.common_s
> > tartup_64
> > 66.08 +0.8 66.85 perf-
> > profile.calltrace.cycles-
> > pp.cpu_startup_entry.start_secondary.common_startup_64
> > 66.08 +0.8 66.85 perf-
> > profile.calltrace.cycles-pp.start_secondary.common_startup_64
> > 66.02 +0.8 66.80 perf-
> > profile.calltrace.cycles-
> > pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
> > 67.06 +0.9 68.00 perf-
> > profile.calltrace.cycles-pp.common_startup_64
> > 21.19 -0.9 20.30 perf-
> > profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 21.15 -0.9 20.27 perf-
> > profile.children.cycles-pp.do_syscall_64
> > 7.92 -0.4 7.53 ± 2% perf-
> > profile.children.cycles-pp.execve
> > 7.94 -0.4 7.56 ± 2% perf-
> > profile.children.cycles-pp.__x64_sys_execve
> > 7.84 -0.4 7.46 ± 2% perf-
> > profile.children.cycles-pp.do_execveat_common
> > 5.51 -0.3 5.25 ± 2% perf-
> > profile.children.cycles-pp.load_elf_binary
> > 3.68 -0.2 3.49 ± 2% perf-
> > profile.children.cycles-pp.__mmput
> > 2.81 ± 2% -0.2 2.63 perf-
> > profile.children.cycles-pp.__x64_sys_exit_group
> > 2.80 ± 2% -0.2 2.62 ± 2% perf-
> > profile.children.cycles-pp.do_exit
> > 2.81 ± 2% -0.2 2.62 ± 2% perf-
> > profile.children.cycles-pp.do_group_exit
> > 2.93 ± 2% -0.2 2.76 ± 2% perf-
> > profile.children.cycles-pp.x64_sys_call
> > 3.60 -0.2 3.44 ± 2% perf-
> > profile.children.cycles-pp.exit_mmap
> > 5.66 -0.1 5.51 perf-
> > profile.children.cycles-pp.__handle_mm_fault
> > 1.94 ± 3% -0.1 1.82 ± 2% perf-
> > profile.children.cycles-pp.exit_mm
> > 2.64 -0.1 2.52 ± 3% perf-
> > profile.children.cycles-pp.vm_mmap_pgoff
> > 2.55 ± 2% -0.1 2.43 ± 3% perf-
> > profile.children.cycles-pp.do_mmap
> > 2.19 ± 2% -0.1 2.08 ± 3% perf-
> > profile.children.cycles-pp.__mmap_region
> > 2.27 -0.1 2.16 ± 2% perf-
> > profile.children.cycles-pp.begin_new_exec
> > 2.79 -0.1 2.69 ± 2% perf-
> > profile.children.cycles-pp.exec_test
> > 0.83 ± 4% -0.1 0.76 ± 6% perf-
> > profile.children.cycles-pp.__mmap_prepare
> > 0.86 ± 4% -0.1 0.78 ± 5% perf-
> > profile.children.cycles-pp.wait4
> > 0.52 ± 5% -0.1 0.45 ± 7% perf-
> > profile.children.cycles-pp.kernel_wait4
> > 0.50 ± 5% -0.1 0.43 ± 6% perf-
> > profile.children.cycles-pp.do_wait
> > 0.88 ± 3% -0.1 0.81 ± 2% perf-
> > profile.children.cycles-pp.kmem_cache_free
> > 0.51 ± 2% -0.1 0.46 ± 6% perf-
> > profile.children.cycles-pp.setup_arg_pages
> > 0.39 ± 2% -0.0 0.34 ± 8% perf-
> > profile.children.cycles-pp.unlink_anon_vmas
> > 0.08 ± 10% -0.0 0.04 ± 71% perf-
> > profile.children.cycles-pp.perf_adjust_freq_unthr_context
> > 0.37 ± 5% -0.0 0.33 ± 3% perf-
> > profile.children.cycles-pp.__memcg_slab_free_hook
> > 0.21 ± 6% -0.0 0.17 ± 5% perf-
> > profile.children.cycles-pp.user_path_at
> > 0.21 ± 3% -0.0 0.18 ± 10% perf-
> > profile.children.cycles-pp.__percpu_counter_sum
> > 0.18 ± 7% -0.0 0.15 ± 5% perf-
> > profile.children.cycles-pp.alloc_empty_file
> > 0.33 ± 5% -0.0 0.30 perf-
> > profile.children.cycles-pp.relocate_vma_down
> > 0.04 ± 45% +0.0 0.08 ± 12% perf-
> > profile.children.cycles-pp.__update_load_avg_se
> > 0.14 ± 7% +0.0 0.18 ± 10% perf-
> > profile.children.cycles-pp.hrtimer_start_range_ns
> > 0.19 ± 9% +0.0 0.24 ± 7% perf-
> > profile.children.cycles-pp.prepare_task_switch
> > 0.02 ±142% +0.0 0.06 ± 23% perf-
> > profile.children.cycles-pp.select_task_rq
> > 0.03 ±100% +0.0 0.08 ± 8% perf-
> > profile.children.cycles-pp.task_contending
> > 0.45 ± 7% +0.1 0.51 ± 3% perf-
> > profile.children.cycles-pp.__pick_next_task
> > 0.14 ± 22% +0.1 0.20 ± 10% perf-
> > profile.children.cycles-pp.kick_pool
> > 0.36 ± 4% +0.1 0.42 ± 4% perf-
> > profile.children.cycles-pp.dequeue_entities
> > 0.36 ± 4% +0.1 0.44 ± 5% perf-
> > profile.children.cycles-pp.dequeue_task_fair
> > 0.15 ± 20% +0.1 0.23 ± 10% perf-
> > profile.children.cycles-pp.__queue_work
> > 0.49 ± 5% +0.1 0.57 ± 7% perf-
> > profile.children.cycles-pp.schedule_idle
> > 0.14 ± 22% +0.1 0.23 ± 9% perf-
> > profile.children.cycles-pp.queue_work_on
> > 0.36 ± 3% +0.1 0.46 ± 9% perf-
> > profile.children.cycles-pp.exit_to_user_mode_loop
> > 0.47 ± 7% +0.1 0.57 ± 7% perf-
> > profile.children.cycles-pp.timerqueue_del
> > 0.30 ± 13% +0.1 0.42 ± 7% perf-
> > profile.children.cycles-pp.ttwu_do_activate
> > 0.23 ± 15% +0.1 0.37 ± 4% perf-
> > profile.children.cycles-pp.flush_smp_call_function_queue
> > 0.18 ± 14% +0.1 0.32 ± 3% perf-
> > profile.children.cycles-pp.sched_ttwu_pending
> > 0.19 ± 13% +0.1 0.34 ± 4% perf-
> > profile.children.cycles-pp.__flush_smp_call_function_queue
> > 0.61 ± 3% +0.2 0.76 ± 5% perf-
> > profile.children.cycles-pp.schedule
> > 1.60 ± 4% +0.2 1.80 ± 2% perf-
> > profile.children.cycles-pp.ret_from_fork_asm
> > 1.60 ± 4% +0.2 1.80 ± 2% perf-
> > profile.children.cycles-pp.ret_from_fork
> > 0.88 ± 7% +0.2 1.09 ± 3% perf-
> > profile.children.cycles-pp.kthread
> > 1.22 ± 3% +0.2 1.45 ± 5% perf-
> > profile.children.cycles-pp.__schedule
> > 0.54 ± 8% +0.2 0.78 ± 5% perf-
> > profile.children.cycles-pp.worker_thread
> > 66.08 +0.8 66.85 perf-
> > profile.children.cycles-pp.start_secondary
> > 67.06 +0.9 68.00 perf-
> > profile.children.cycles-pp.common_startup_64
> > 67.06 +0.9 68.00 perf-
> > profile.children.cycles-pp.cpu_startup_entry
> > 67.06 +0.9 68.00 perf-
> > profile.children.cycles-pp.do_idle
> > 0.08 ± 10% -0.0 0.04 ± 71% perf-
> > profile.self.cycles-pp.perf_adjust_freq_unthr_context
> > 0.04 ± 45% +0.0 0.08 ± 10% perf-
> > profile.self.cycles-pp.__update_load_avg_se
> > 0.14 ± 10% +0.1 0.23 ± 11% perf-
> > profile.self.cycles-pp.timerqueue_del
> >
> >
> >
> > *******************************************************************
> > ********************************
> > lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU
> > @ 2.00GHz (Ice Lake) with 256G memory
> > ===================================================================
> > ======================
> > compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/te
> > st/testcase:
> > gcc-12/performance/1BRD_48G/xfs/x86_64-rhel-9.4/600/debian-12-
> > x86_64-20240206.cgz/lkp-icl-2sp2/sync_disk_rw/aim7
> >
> > commit:
> > baffb12277 ("sched: Add prev_sum_exec_runtime support for RT, DL
> > and SCX classes")
> > f3de761c52 ("sched: Move task_mm_cid_work to mm work_struct")
> >
> > baffb122772da116 f3de761c52148abfb1b4512914f
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 344180 ± 6% -13.0% 299325 ± 9% meminfo.Mapped
> > 9594 ±123% +191.8% 27995 ± 54% numa-
> > meminfo.node1.PageTables
> > 2399 ±123% +191.3% 6989 ± 54% numa-
> > vmstat.node1.nr_page_table_pages
> > 1860734 -5.2% 1763194 vmstat.io.bo
> > 831686 +1.3% 842493 vmstat.system.cs
> > 50372 -5.5% 47609 aim7.jobs-per-min
> > 1435644 +11.5% 1600707
> > aim7.time.involuntary_context_switches
> > 7242 +1.2% 7332
> > aim7.time.percent_of_cpu_this_job_got
> > 5159 +7.1% 5526
> > aim7.time.system_time
> > 33195986 +6.9% 35497140
> > aim7.time.voluntary_context_switches
> > 40987 ± 10% -19.8% 32872 ± 9%
> > sched_debug.cfs_rq:/.avg_vruntime.stddev
> > 40987 ± 10% -19.8% 32872 ± 9%
> > sched_debug.cfs_rq:/.min_vruntime.stddev
> > 605972 ± 2% +14.5% 693922 ± 7%
> > sched_debug.cpu.avg_idle.max
> > 30974 ± 8% -20.9% 24498 ± 15%
> > sched_debug.cpu.avg_idle.min
> > 118758 ± 5% +22.0% 144899 ± 6%
> > sched_debug.cpu.avg_idle.stddev
> > 856253 +1.5% 869009 perf-stat.i.context-
> > switches
> > 3.06 +2.3% 3.13 perf-stat.i.cpi
> > 164824 +7.7% 177546 perf-stat.i.cpu-
> > migrations
> > 7.93 +2.5% 8.13 perf-
> > stat.i.metric.K/sec
> > 3.41 +1.8% 3.47 perf-
> > stat.overall.cpi
> > 1355 +5.8% 1434 ± 4% perf-
> > stat.overall.cycles-between-cache-misses
> > 0.29 -1.8% 0.29 perf-
> > stat.overall.ipc
> > 845412 +1.6% 858925 perf-
> > stat.ps.context-switches
> > 162728 +7.8% 175475 perf-stat.ps.cpu-
> > migrations
> > 4.391e+12 +5.0% 4.609e+12 perf-
> > stat.total.instructions
> > 444798 +6.0% 471383 ± 5% proc-
> > vmstat.nr_active_anon
> > 28190 -2.8% 27402 proc-vmstat.nr_dirty
> > 1231373 +2.3% 1259666 ± 2% proc-
> > vmstat.nr_file_pages
> > 63763 +0.9% 64355 proc-
> > vmstat.nr_inactive_file
> > 86758 ± 6% -12.9% 75546 ± 8% proc-
> > vmstat.nr_mapped
> > 10162 ± 2% +7.2% 10895 ± 3% proc-
> > vmstat.nr_page_table_pages
> > 265229 +10.4% 292795 ± 9% proc-vmstat.nr_shmem
> > 444798 +6.0% 471383 ± 5% proc-
> > vmstat.nr_zone_active_anon
> > 63763 +0.9% 64355 proc-
> > vmstat.nr_zone_inactive_file
> > 28191 -2.8% 27400 proc-
> > vmstat.nr_zone_write_pending
> > 24349 +11.6% 27171 ± 8% proc-vmstat.pgreuse
> > 0.02 ± 3% +11.3% 0.03 ± 2% perf-
> > sched.sch_delay.avg.ms.__cond_resched.down.xlog_write_iclog.xlog_st
> > ate_release_iclog.xlog_write_get_more_iclog_space
> > 0.29 ± 17% -30.7% 0.20 ± 14% perf-
> > sched.sch_delay.avg.ms.__cond_resched.down_read.xfs_file_fsync.xfs_
> > file_buffered_write.vfs_write
> > 0.03 ± 10% +33.5% 0.04 ± 2% perf-
> > sched.sch_delay.avg.ms.__cond_resched.process_one_work.worker_threa
> > d.kthread.ret_from_fork
> > 0.21 ± 32% -100.0% 0.00 perf-
> > sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.16 ± 16% +51.9% 0.24 ± 11% perf-
> > sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 0.22 ± 19% +44.1% 0.32 ± 25% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_f
> > unction_single.[unknown]
> > 0.30 ± 28% -38.7% 0.18 ± 28% perf-
> > sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown]
> > 0.11 ± 5% +12.8% 0.12 ± 4% perf-
> > sched.sch_delay.avg.ms.xlog_cil_force_seq.xfs_log_force_seq.xfs_fil
> > e_fsync.xfs_file_buffered_write
> > 0.08 ± 4% +15.8% 0.09 ± 4% perf-
> > sched.sch_delay.avg.ms.xlog_wait.xlog_force_lsn.xfs_log_force_seq.x
> > fs_file_fsync
> > 0.02 ± 3% +13.7% 0.02 ± 4% perf-
> > sched.sch_delay.avg.ms.xlog_wait_on_iclog.xlog_cil_push_work.proces
> > s_one_work.worker_thread
> > 0.01 ±223% +1289.5% 0.09 ±111% perf-
> > sched.sch_delay.max.ms.__cond_resched.__kmalloc_cache_noprof.xlog_c
> > il_ctx_alloc.xlog_cil_push_work.process_one_work
> > 2.49 ± 40% -43.4% 1.41 ± 50% perf-
> > sched.sch_delay.max.ms.__cond_resched.down_read.xfs_file_fsync.xfs_
> > file_buffered_write.vfs_write
> > 0.76 ± 7% +92.8% 1.46 ± 40% perf-
> > sched.sch_delay.max.ms.__cond_resched.process_one_work.worker_threa
> > d.kthread.ret_from_fork
> > 0.65 ± 41% -100.0% 0.00 perf-
> > sched.sch_delay.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.40 ± 64% +2968.7% 43.04 ± 13% perf-
> > sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 0.63 ± 19% +89.8% 1.19 ± 51% perf-
> > sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_
> > completion_state.kernel_clone
> > 28.67 ± 3% -11.2% 25.45 ± 5% perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.__flus
> > h_workqueue.xlog_cil_push_now.isra
> > 0.80 ± 9% -100.0% 0.00 perf-
> > sched.wait_and_delay.avg.ms.__cond_resched.down.xlog_write_iclog.xl
> > og_state_release_iclog.xlog_write_get_more_iclog_space
> > 5.76 ±107% +152.4% 14.53 ± 10% perf-
> > sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.en
> > try_SYSCALL_64_after_hwframe.[unknown]
> > 8441 -100.0% 0.00 perf-
> > sched.wait_and_delay.count.__cond_resched.down.xlog_write_iclog.xlo
> > g_state_release_iclog.xlog_write_get_more_iclog_space
> > 18.67 ± 71% +108.0% 38.83 ± 5% perf-
> > sched.wait_and_delay.count.__cond_resched.down_read.xlog_cil_commit
> > .__xfs_trans_commit.xfs_trans_commit
> > 116.17 ±105% +1677.8% 2065 ± 5% perf-
> > sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.ent
> > ry_SYSCALL_64_after_hwframe.[unknown]
> > 424.79 ±151% -100.0% 0.00 perf-
> > sched.wait_and_delay.max.ms.__cond_resched.down.xlog_write_iclog.xl
> > og_state_release_iclog.xlog_write_get_more_iclog_space
> > 28.51 ± 3% -11.2% 25.31 ± 4% perf-
> > sched.wait_time.avg.ms.__cond_resched.__wait_for_common.__flush_wor
> > kqueue.xlog_cil_push_now.isra
> > 0.38 ± 59% -79.0% 0.08 ±107% perf-
> > sched.wait_time.avg.ms.__cond_resched.down.xlog_write_iclog.xlog_st
> > ate_release_iclog.xlog_state_get_iclog_space
> > 0.77 ± 9% -56.5% 0.34 ± 3% perf-
> > sched.wait_time.avg.ms.__cond_resched.down.xlog_write_iclog.xlog_st
> > ate_release_iclog.xlog_write_get_more_iclog_space
> > 1.80 ±138% -100.0% 0.00 perf-
> > sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 6.13 ± 93% +133.2% 14.29 ± 10% perf-
> > sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_S
> > YSCALL_64_after_hwframe.[unknown]
> > 1.00 ± 16% -48.1% 0.52 ± 20% perf-
> > sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_resche
> > dule_ipi.[unknown]
> > 0.92 ± 16% -62.0% 0.35 ± 14% perf-
> > sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_s
> > lowpath.down_write.xlog_cil_push_work
> > 0.26 ± 2% -59.8% 0.11 perf-
> > sched.wait_time.avg.ms.xlog_wait_on_iclog.xlog_cil_push_work.proces
> > s_one_work.worker_thread
> > 0.24 ±223% +2180.2% 5.56 ± 83% perf-
> > sched.wait_time.max.ms.__cond_resched.__kmalloc_cache_noprof.xlog_c
> > il_ctx_alloc.xlog_cil_push_work.process_one_work
> > 1.25 ± 77% -79.8% 0.25 ±107% perf-
> > sched.wait_time.max.ms.__cond_resched.down.xlog_write_iclog.xlog_st
> > ate_release_iclog.xlog_state_get_iclog_space
> > 1.78 ± 51% +958.6% 18.82 ±117% perf-
> > sched.wait_time.max.ms.__cond_resched.mempool_alloc_noprof.bio_allo
> > c_bioset.iomap_writepage_map_blocks.iomap_writepage_map
> > 58.48 ± 6% -10.7% 52.22 ± 2% perf-
> > sched.wait_time.max.ms.__cond_resched.mutex_lock.__flush_workqueue.
> > xlog_cil_push_now.isra
> > 10.87 ±192% -100.0% 0.00 perf-
> > sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mo
> > de_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 8.63 ± 27% -63.9% 3.12 ± 29% perf-
> > sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_s
> > lowpath.down_write.xlog_cil_push_work
> >
> >
> >
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and
> > are provided
> > for informational purposes only. Any difference in system hardware
> > or software
> > design or configuration may affect actual performance.
> >
> >
>