Message-ID: <aUe/9VB7K1UeyT2/@xsang-OptiPlex-9020>
Date: Sun, 21 Dec 2025 17:37:57 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Shrikanth Hegde <sshegde@...ux.ibm.com>, <oe-lkp@...ts.linux.dev>,
	<lkp@...el.com>, <linux-kernel@...r.kernel.org>, <x86@...nel.org>, "Ingo
 Molnar" <mingo@...nel.org>, Linus Torvalds <torvalds@...ux-foundation.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>, Juri Lelli
	<juri.lelli@...hat.com>, Mel Gorman <mgorman@...e.de>, Valentin Schneider
	<vschneid@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
	<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: Re: [tip:sched/core] [sched/fair] 089d84203a:
 pts.schbench.32.usec,_99.9th_latency_percentile 52.4% regression

hi, Peter Zijlstra,

On Thu, Dec 18, 2025 at 11:20:20AM +0100, Peter Zijlstra wrote:
> On Thu, Dec 18, 2025 at 03:41:55PM +0530, Shrikanth Hegde wrote:
> > On 12/18/25 2:07 PM, Peter Zijlstra wrote:
> > > On Thu, Dec 18, 2025 at 12:59:53PM +0800, kernel test robot wrote:
> > > > 
> > > > 
> > > > Hello,
> > > > 
> > > > kernel test robot noticed a 52.4% regression of pts.schbench.32.usec,_99.9th_latency_percentile on:
> > > > 
> > > > 
> > > > commit: 089d84203ad42bc8fd6dbf41683e162ac6e848cd ("sched/fair: Fold the sched_avg update")
> > > > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
> > > 
> > > Well, that obviously wasn't the intention. Let me pull that patch :/
> > 
> > Is it possible because it missed scaling by se_weight(se) ??
> 
> >  static inline void
> >  enqueue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -       cfs_rq->avg.load_avg += se->avg.load_avg;
> > -       cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
> > +       __update_sa(&cfs_rq->avg, load, se->avg.load_avg, se->avg.load_sum);
> >  }
> 
> Ah, indeed, something like so then? Can the robot (Oliver/Philip)
> verify?

Sorry for the late reply. The server (Cascade Lake) used for the original
bisect/report had been repurposed in the meantime, so we picked another server:

  test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P CPU @ 2.4GHz (Granite Rapids) with 256G memory

to reproduce the regression (and the improvement as well, actually), then
tested your patch below. Based on the results, performance is restored to be
similar to 38a68b982d (the parent of 089d84203a).
(Both the regression and the improvement show lower percentages on this
Granite Rapids server.)

Tested-by: kernel test robot <oliver.sang@...el.com>

For the regression:
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/32/2/debian-12-x86_64-phoronix/lkp-gnr-2sp3/schbench-1.1.0/pts

commit: 
  38a68b982d ("<linux/compiler_types.h>: Add the __signed_scalar_typeof() helper")
  089d84203a ("sched/fair: Fold the sched_avg update")
  d936730940 ("sched/fair: Fix sched_avg fold")

38a68b982dd0b10e 089d84203ad42bc8fd6dbf41683 d936730940bfff3f3b22770cfe9
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     18752           +16.0%      21749            -0.1%      18730        pts.schbench.32.usec,_50.0th_latency_percentile
     29493 ±  2%     +17.3%      34581            +0.3%      29568        pts.schbench.32.usec,_75.0th_latency_percentile
     40810           +15.2%      46997            +0.3%      40938        pts.schbench.32.usec,_90.0th_latency_percentile
     84437            +7.2%      90496            +1.1%      85376        pts.schbench.32.usec,_99.9th_latency_percentile

The full comparison is below [1].

BTW, in our original report https://lore.kernel.org/all/202512181208.753b9f6e-lkp@intel.com/
we also reported an improvement, still on that Cascade Lake server.

+------------------+-----------------------------------------------------------------------------------------------+
| testcase: change | pts: pts.stress-ng.Semaphores.bogo_ops_s 17.0% improvement                                    |
| test machine     | 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                  |
|                  | option_a=Semaphores                                                                           |
|                  | test=stress-ng-1.11.0                                                                         |
+------------------+-----------------------------------------------------------------------------------------------+

In our tests, that performance was also restored by your patch.

=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/Semaphores/debian-12-x86_64-phoronix/lkp-gnr-2sp3/stress-ng-1.11.0/pts

commit: 
  38a68b982d ("<linux/compiler_types.h>: Add the __signed_scalar_typeof() helper")
  089d84203a ("sched/fair: Fold the sched_avg update")
  d936730940 ("sched/fair: Fix sched_avg fold")

38a68b982dd0b10e 089d84203ad42bc8fd6dbf41683 d936730940bfff3f3b22770cfe9
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
 3.533e+08           +11.3%  3.934e+08            +1.1%  3.573e+08        pts.stress-ng.Semaphores.bogo_ops_s

The full comparison is below [2].


> 
> (I was going to shelve it and look at it after the holidays, but if this
> is it, we can get it fixed before I disappear).
> 
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 76f5e4b78b30..7377f9117501 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3775,13 +3775,15 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  static inline void
>  enqueue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	__update_sa(&cfs_rq->avg, load, se->avg.load_avg, se->avg.load_sum);
> +	__update_sa(&cfs_rq->avg, load, se->avg.load_avg,
> +		    se_weight(se) * se->avg.load_sum);
>  }
>  
>  static inline void
>  dequeue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	__update_sa(&cfs_rq->avg, load, -se->avg.load_avg, -se->avg.load_sum);
> +	__update_sa(&cfs_rq->avg, load, -se->avg.load_avg,
> +		    se_weight(se) * -se->avg.load_sum);
>  }
>  
>  static void place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags);
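
As a side note for anyone following along, below is a minimal user-space
sketch of the accounting difference this patch restores. It is purely
illustrative: the structures, helper names, and numbers are simplified
stand-ins, not the kernel's struct sched_avg or the actual __update_sa()
helper.

/*
 * Toy user-space illustration (NOT kernel code; simplified, hypothetical
 * field names and numbers): how dropping the se_weight() factor on
 * load_sum makes the folded enqueue path diverge from the old one.
 */
#include <stdio.h>

struct toy_avg {
	unsigned long load_avg;
	unsigned long load_sum;
};

struct toy_se {
	unsigned long weight;		/* stands in for se_weight(se) */
	struct toy_avg avg;
};

/* Pre-fold behaviour: the load_sum contribution is scaled by the weight. */
static void enqueue_weighted(struct toy_avg *cfs, const struct toy_se *se)
{
	cfs->load_avg += se->avg.load_avg;
	cfs->load_sum += se->weight * se->avg.load_sum;
}

/* The broken fold: the weight factor on load_sum is lost. */
static void enqueue_unweighted(struct toy_avg *cfs, const struct toy_se *se)
{
	cfs->load_avg += se->avg.load_avg;
	cfs->load_sum += se->avg.load_sum;
}

int main(void)
{
	struct toy_se se = {
		.weight = 1024,		/* e.g. a nice-0 weight */
		.avg = { .load_avg = 300, .load_sum = 14000 },
	};
	struct toy_avg good = { 0, 0 }, bad = { 0, 0 };

	enqueue_weighted(&good, &se);
	enqueue_unweighted(&bad, &se);

	/*
	 * The unweighted sum comes out roughly 1024x too small here, so the
	 * runqueue-level load tracking sees a much smaller sum than intended.
	 */
	printf("weighted   load_sum: %lu\n", good.load_sum);
	printf("unweighted load_sum: %lu\n", bad.load_sum);
	return 0;
}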

[1]
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/32/2/debian-12-x86_64-phoronix/lkp-gnr-2sp3/schbench-1.1.0/pts

commit: 
  38a68b982d ("<linux/compiler_types.h>: Add the __signed_scalar_typeof() helper")
  089d84203a ("sched/fair: Fold the sched_avg update")
  d936730940 ("sched/fair: Fix sched_avg fold")

38a68b982dd0b10e 089d84203ad42bc8fd6dbf41683 d936730940bfff3f3b22770cfe9
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
      0.06 ±  2%      +0.0        0.07            -0.0        0.06 ±  2%  mpstat.cpu.all.soft%
     58411            -7.1%      54274            -0.3%      58244        vmstat.system.cs
    291141            -1.2%     287740            -0.5%     289788        vmstat.system.in
   2018606            -9.1%    1834308            -0.1%    2015969        time.involuntary_context_switches
      7153            -1.0%       7079            +0.1%       7162        time.user_time
    607814            -3.1%     589005            -0.0%     607669        time.voluntary_context_switches
     18752           +16.0%      21749            -0.1%      18730        pts.schbench.32.usec,_50.0th_latency_percentile
     29493 ±  2%     +17.3%      34581            +0.3%      29568        pts.schbench.32.usec,_75.0th_latency_percentile
     40810           +15.2%      46997            +0.3%      40938        pts.schbench.32.usec,_90.0th_latency_percentile
     84437            +7.2%      90496            +1.1%      85376        pts.schbench.32.usec,_99.9th_latency_percentile
   2018606            -9.1%    1834308            -0.1%    2015969        pts.time.involuntary_context_switches
      7153            -1.0%       7079            +0.1%       7162        pts.time.user_time
    607814            -3.1%     589005            -0.0%     607669        pts.time.voluntary_context_switches
      0.27 ±  4%      -0.1%       0.27 ±  5%      +7.9%       0.29 ±  2%  perf-stat.i.MPKI
     17.00 ±  2%      -0.8       16.23            +0.4       17.44 ±  2%  perf-stat.i.cache-miss-rate%
     60336            -7.5%      55810            -0.9%      59784        perf-stat.i.context-switches
 5.501e+11            -1.5%  5.419e+11            -0.6%  5.468e+11        perf-stat.i.cpu-cycles
      6222           +14.4%       7119            -0.7%       6178        perf-stat.i.cpu-migrations
    154698            +3.3%     159875 ±  2%      -0.1%     154599 ±  4%  perf-stat.i.cycles-between-cache-misses
    891574            -3.0%     864626            -1.4%     879513        perf-stat.i.dTLB-store-misses
      4.84 ±  5%      +6.0%       5.13            +3.2%       4.99        perf-stat.i.major-faults
      2.15            -1.6%       2.11            -0.6%       2.13        perf-stat.i.metric.GHz
     60586            -6.9%      56433            -0.8%      60112        perf-stat.ps.context-switches
      6258           +15.3%       7212            -0.6%       6221        perf-stat.ps.cpu-migrations
    888660            -2.5%     866297            -1.2%     877931        perf-stat.ps.dTLB-store-misses
      4.49 ±  5%      +4.3%       4.68            +3.0%       4.62        perf-stat.ps.major-faults
      3.94 ± 92%      -3.2        0.71 ±147%      -1.9        2.00 ±134%  perf-profile.calltrace.cycles-pp.arch_show_interrupts.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read
      0.43 ±142%      +1.7        2.11 ± 42%      +1.6        2.07 ± 61%  perf-profile.calltrace.cycles-pp.pv_native_safe_halt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.calltrace.cycles-pp.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
      1.08 ± 77%      +2.0        3.08 ± 25%      +1.7        2.80 ± 77%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      1.08 ± 77%      +2.0        3.08 ± 25%      +1.7        2.80 ± 77%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     12.35 ± 67%     +10.7       23.07 ± 14%      -0.1       12.28 ±101%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.09 ± 71%     +11.0       23.07 ± 14%      +0.2       12.28 ±101%  perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.94 ± 92%      -3.2        0.71 ±147%      -1.9        2.00 ±134%  perf-profile.children.cycles-pp.arch_show_interrupts
      0.24 ±223%      +1.3        1.55 ±  9%      +0.2        0.42 ±141%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.24 ±223%      +1.3        1.55 ±  9%      +0.2        0.42 ±141%  perf-profile.children.cycles-pp.handle_softirqs
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.acpi_idle_do_entry
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.acpi_idle_enter
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.acpi_safe_halt
      1.08 ± 77%      +1.7        2.82 ± 34%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.pv_native_safe_halt
      1.08 ± 77%      +2.0        3.08 ± 25%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.cpuidle_enter
      1.08 ± 77%      +2.0        3.08 ± 25%      +1.7        2.80 ± 77%  perf-profile.children.cycles-pp.cpuidle_enter_state
      1.24 ±147%      +3.2        4.44 ± 37%      +0.9        2.14 ± 80%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.02 ± 25%    +209.1%       0.05 ± 26%     +21.2%       0.02 ± 27%  perf-sched.sch_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.25 ±  6%     +36.2%       0.34 ± 10%      +0.1%       0.25 ±  6%  perf-sched.sch_delay.avg.ms.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown]
      0.25 ±  7%     +35.8%       0.34 ±  9%      -0.5%       0.25 ±  6%  perf-sched.sch_delay.avg.ms.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.21 ± 74%     -83.8%       0.03 ±136%     +92.6%       0.40 ±145%  perf-sched.sch_delay.avg.ms.irqentry_exit.asm_sysvec_call_function.[unknown].[unknown]
      0.16 ± 12%     +31.2%       0.21 ± 16%      +4.4%       0.17 ± 12%  perf-sched.sch_delay.avg.ms.irqentry_exit.asm_sysvec_reschedule_ipi.[unknown]
      7.88 ± 62%     -82.6%       1.38 ±165%    +195.7%      23.31 ±174%  perf-sched.sch_delay.max.ms.irqentry_exit.asm_sysvec_call_function.[unknown].[unknown]
      0.17 ±  7%     +37.0%       0.23 ± 13%      +0.8%       0.17 ±  8%  perf-sched.total_sch_delay.average.ms
    144.28 ± 10%    +122.9%     321.60 ± 96%      +0.6%     145.18 ± 11%  perf-sched.total_sch_delay.max.ms
      8.72           +32.1%      11.51            -0.3%       8.69        perf-sched.wait_and_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     25.19 ±  2%     +10.4%      27.81 ±  3%      +0.4%      25.30 ±  2%  perf-sched.wait_and_delay.avg.ms.irqentry_exit.asm_sysvec_call_function_single.[unknown]
     29.05           +11.9%      32.50            +0.1%      29.06        perf-sched.wait_and_delay.avg.ms.irqentry_exit.asm_sysvec_reschedule_ipi.[unknown]
     28.93           +12.4%      32.53            +0.0%      28.94        perf-sched.wait_and_delay.avg.ms.irqentry_exit.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    120587           -11.1%     107144            -0.4%     120153        perf-sched.wait_and_delay.count.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown]
    116894           -11.0%     103989            -0.2%     116681        perf-sched.wait_and_delay.count.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      7275 ± 13%     -47.7%       3808 ± 15%     -11.6%       6429 ± 12%  perf-sched.wait_and_delay.count.irqentry_exit.asm_sysvec_call_function_single.[unknown]
      6801 ± 13%     -48.5%       3500 ± 14%     -10.9%       6059 ± 12%  perf-sched.wait_and_delay.count.irqentry_exit.asm_sysvec_call_function_single.[unknown].[unknown]
    811.07 ± 83%     -92.9%      57.96 ±106%      +7.6%     873.05 ± 93%  perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
    213.79 ±  9%     +41.7%     303.02 ± 23%      +4.5%     223.47 ±  6%  perf-sched.wait_and_delay.max.ms.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown]
      8.27 ± 62%     -81.4%       1.54 ±150%     -35.3%       5.36 ± 66%  perf-sched.wait_time.avg.ms.__cond_resched.migrate_pages_batch.migrate_pages.migrate_misplaced_folio.do_huge_pmd_numa_page
      8.70           +31.7%      11.46            -0.3%       8.67        perf-sched.wait_time.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     24.73 ± 73%    +128.9%      56.61 ± 30%     +12.5%      27.81 ± 57%  perf-sched.wait_time.avg.ms.irqentry_exit.asm_exc_page_fault.[unknown]
     25.06 ±  2%     +10.2%      27.61 ±  3%      +0.4%      25.17 ±  2%  perf-sched.wait_time.avg.ms.irqentry_exit.asm_sysvec_call_function_single.[unknown]
     28.88           +11.8%      32.29            +0.0%      28.89        perf-sched.wait_time.avg.ms.irqentry_exit.asm_sysvec_reschedule_ipi.[unknown]
     28.76           +12.3%      32.31            +0.0%      28.77        perf-sched.wait_time.avg.ms.irqentry_exit.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.98 ±140%     -99.2%       0.01 ± 11%     -32.3%       0.66 ±220%  perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
    203.03 ±  6%     +12.8%     229.11 ±  4%      +2.9%     208.98 ±  4%  perf-sched.wait_time.max.ms.irqentry_exit.asm_sysvec_apic_timer_interrupt.[unknown]
      1196 ±  9%     +32.8%       1589 ± 14%     +10.1%       1317 ± 14%  sched_debug.cfs_rq:/.avg_vruntime.avg
     18675 ± 15%     +48.2%      27668 ± 31%     +20.2%      22441 ± 39%  sched_debug.cfs_rq:/.avg_vruntime.max
      2404 ±  7%     +29.7%       3118 ± 13%      +8.1%       2599 ± 17%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      6.09 ± 45%    +557.7%      40.05 ± 14%     -24.1%       4.62 ± 28%  sched_debug.cfs_rq:/.load_avg.avg
      6.69 ± 34%     -33.3%       4.46 ± 33%     -47.7%       3.49 ± 39%  sched_debug.cfs_rq:/.util_est.avg
     55.43 ± 18%     -19.2%      44.80 ± 17%     -26.6%      40.69 ± 19%  sched_debug.cfs_rq:/.util_est.stddev
      1126 ±  8%     +30.1%       1464 ± 14%      +9.5%       1233 ± 15%  sched_debug.cfs_rq:/.zero_vruntime.avg
     18675 ± 15%     +48.2%      27668 ± 31%     +20.2%      22441 ± 39%  sched_debug.cfs_rq:/.zero_vruntime.max
      2293 ±  7%     +23.8%       2840 ± 11%      +7.7%       2470 ± 19%  sched_debug.cfs_rq:/.zero_vruntime.stddev
      4.69 ± 11%   +5177.9%     247.68 ±  6%      -2.9%       4.56 ± 18%  sched_debug.cfs_rq:/system.slice.load_avg.avg
     33.33 ± 10%   +2518.5%     872.83 ± 14%      -3.5%      32.17 ± 10%  sched_debug.cfs_rq:/system.slice.load_avg.max
      7.52 ± 10%   +2180.0%     171.48 ±  6%      -1.7%       7.39 ± 14%  sched_debug.cfs_rq:/system.slice.load_avg.stddev
     24021 ± 43%     -53.8%      11098 ± 29%      +4.4%      25077 ± 34%  sched_debug.cfs_rq:/system.slice.se->load.weight.avg
    950.50 ± 17%     -86.0%     132.83 ± 22%      -3.2%     919.67 ± 23%  sched_debug.cfs_rq:/system.slice.se->load.weight.min
      1426 ± 14%     +39.1%       1984 ± 11%     +11.7%       1593 ± 20%  sched_debug.cfs_rq:/system.slice.se->vruntime.avg
     17114 ± 14%     +61.8%      27689 ± 31%     +28.8%      22038 ± 41%  sched_debug.cfs_rq:/system.slice.se->vruntime.max
      2553 ±  7%     +45.5%       3714 ± 16%     +15.7%       2954 ± 25%  sched_debug.cfs_rq:/system.slice.se->vruntime.stddev
      2077 ± 52%   +1663.6%      36636 ±  8%     +14.1%       2369 ± 32%  sched_debug.cfs_rq:/system.slice.tg_load_avg.avg
      4280 ± 33%   +1577.3%      71794 ±  3%      +1.4%       4341 ± 23%  sched_debug.cfs_rq:/system.slice.tg_load_avg.max
      1662 ± 73%   +1347.5%      24056 ± 12%     +22.7%       2039 ± 32%  sched_debug.cfs_rq:/system.slice.tg_load_avg.min
    509.03 ± 42%   +2013.2%      10756 ±  9%     -32.2%     345.01 ± 35%  sched_debug.cfs_rq:/system.slice.tg_load_avg.stddev
     14.23 ± 60%   +1647.1%     248.69 ±  6%     +27.0%      18.07 ± 37%  sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.avg
     79.49 ± 68%    +117.9%     173.24 ±  7%     +41.0%     112.06 ± 25%  sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.stddev
      0.75 ± 37%    -198.7%      -0.74          -156.6%      -0.43        sched_debug.cfs_rq:/system.slice/containerd.service.avg_vruntime.max
      0.73 ± 17%     -84.6%       0.11 ±215%     -65.3%       0.25 ±140%  sched_debug.cfs_rq:/system.slice/containerd.service.avg_vruntime.stddev
      1.01 ± 27%     -44.6%       0.56 ± 52%     -16.0%       0.85 ± 17%  sched_debug.cfs_rq:/system.slice/containerd.service.se->avg.load_avg.stddev
      0.75 ± 37%    -198.7%      -0.74          -156.6%      -0.43        sched_debug.cfs_rq:/system.slice/containerd.service.zero_vruntime.max
      0.73 ± 17%     -84.6%       0.11 ±215%     -65.3%       0.25 ±140%  sched_debug.cfs_rq:/system.slice/containerd.service.zero_vruntime.stddev
    281.30 ± 10%     +35.7%     381.81 ±  7%      -1.9%     276.09 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.load_avg.avg
     11670 ±  4%     -24.4%       8826 ±  5%      -3.3%      11288 ±  4%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->load.weight.min
    386.83 ± 26%     +33.2%     515.16 ± 12%     +27.2%     492.08 ± 19%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->vruntime.avg
     42031 ± 10%     +30.9%      55022 ±  8%      -0.5%      41811 ±  8%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.avg
     81249 ±  4%     +39.2%     113112 ±  7%      +0.4%      81550 ±  8%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.max
      9990 ±  6%     +50.0%      14989 ±  8%      +1.8%      10172 ± 15%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.stddev
    280.84 ± 11%     +31.5%     369.28 ±  5%      +0.8%     283.22 ± 14%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg_contrib.avg
      2.91 ± 82%     -67.6%       0.94 ± 34%     -19.0%       2.36 ± 91%  sched_debug.cfs_rq:/system.slice/systemd-journald.service.se->sum_exec_runtime.max
    -19.83           +53.8%     -30.50           +75.6%     -34.83        sched_debug.cpu.nr_uninterruptible.min


[2]
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/Semaphores/debian-12-x86_64-phoronix/lkp-gnr-2sp3/stress-ng-1.11.0/pts

commit: 
  38a68b982d ("<linux/compiler_types.h>: Add the __signed_scalar_typeof() helper")
  089d84203a ("sched/fair: Fold the sched_avg update")
  d936730940 ("sched/fair: Fix sched_avg fold")

38a68b982dd0b10e 089d84203ad42bc8fd6dbf41683 d936730940bfff3f3b22770cfe9
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
      0.04 ±  2%      +0.0        0.04 ±  3%      -0.0        0.03 ±  2%  mpstat.cpu.all.soft%
  1.71e+08 ±  3%     -12.3%    1.5e+08 ±  2%      -0.8%  1.696e+08 ±  2%  vmstat.system.cs
     59578 ± 22%     -16.5%      49752 ±  4%     -24.1%      45240 ± 22%  proc-vmstat.numa_hint_faults
     59099 ± 22%     -16.4%      49416 ±  4%     -24.3%      44745 ± 23%  proc-vmstat.numa_hint_faults_local
 8.289e+09           -13.2%  7.198e+09 ±  2%      -1.8%  8.142e+09 ±  2%  time.involuntary_context_switches
      1715            +2.9%       1764            -0.7%       1703        time.user_time
     56162 ± 15%     +79.7%     100932 ±  5%     +16.7%      65551 ±  9%  time.voluntary_context_switches
 3.533e+08           +11.3%  3.934e+08            +1.1%  3.573e+08        pts.stress-ng.Semaphores.bogo_ops_s
 8.289e+09           -13.2%  7.198e+09 ±  2%      -1.8%  8.142e+09 ±  2%  pts.time.involuntary_context_switches
      1715            +2.9%       1764            -0.7%       1703        pts.time.user_time
     56162 ± 15%     +79.7%     100932 ±  5%     +16.7%      65551 ±  9%  pts.time.voluntary_context_switches
      3.20 ± 74%      -1.8        1.44 ±101%      -2.2        1.00 ± 89%  perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      2.79 ± 62%      -1.4        1.44 ±101%      -1.8        1.00 ± 89%  perf-profile.calltrace.cycles-pp.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
      2.63 ± 63%      -1.2        1.44 ±101%      -1.6        1.00 ± 89%  perf-profile.calltrace.cycles-pp.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      1.41 ± 43%      -0.5        0.86 ± 99%      -1.0        0.43 ±100%  perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write
      0.86 ± 45%      +0.1        0.97 ± 57%      +1.5        2.32 ± 38%  perf-profile.calltrace.cycles-pp.lookup_fast.open_last_lookups.path_openat.do_filp_open.do_sys_openat2
      2.06 ± 40%      +0.4        2.50 ± 22%      +2.2        4.22 ± 49%  perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      2.06 ± 40%      +0.4        2.50 ± 22%      +2.2        4.22 ± 49%  perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      2.06 ± 40%      +0.4        2.50 ± 22%      +2.2        4.22 ± 49%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      2.06 ± 40%      +0.4        2.50 ± 22%      +2.2        4.22 ± 49%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.open64
      1.19 ± 58%      +0.6        1.77 ± 52%      +2.0        3.20 ± 59%  perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
      3.69 ± 30%      +1.4        5.11 ± 47%      +3.4        7.05 ± 29%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.69 ± 30%      +1.4        5.11 ± 47%      +3.4        7.05 ± 29%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      3.20 ± 74%      -1.8        1.44 ±101%      -2.2        1.00 ± 89%  perf-profile.children.cycles-pp.do_fault
      2.44 ± 36%      -1.5        0.92 ±104%      -1.7        0.72 ± 81%  perf-profile.children.cycles-pp.mutex_unlock
      2.63 ± 63%      -1.2        1.44 ±101%      -1.6        1.00 ± 89%  perf-profile.children.cycles-pp.do_read_fault
      2.63 ± 63%      -1.2        1.44 ±101%      -1.6        1.00 ± 89%  perf-profile.children.cycles-pp.filemap_map_pages
      1.41 ± 43%      -0.5        0.86 ± 99%      -1.0        0.43 ±100%  perf-profile.children.cycles-pp.shmem_add_to_page_cache
      0.18 ±223%      +0.6        0.77 ± 51%      +1.3        1.46 ± 46%  perf-profile.children.cycles-pp.__d_lookup_rcu
      4.21 ± 31%      +1.5        5.69 ± 45%      +3.0        7.19 ± 32%  perf-profile.children.cycles-pp.do_filp_open
      4.21 ± 31%      +1.5        5.69 ± 45%      +3.0        7.19 ± 32%  perf-profile.children.cycles-pp.path_openat
      2.44 ± 36%      -1.5        0.92 ±104%      -1.7        0.72 ± 81%  perf-profile.self.cycles-pp.mutex_unlock
      0.18 ±223%      +0.6        0.77 ± 51%      +1.3        1.46 ± 46%  perf-profile.self.cycles-pp.__d_lookup_rcu
 1.586e+11 ±  3%      +3.7%  1.644e+11            +2.2%  1.621e+11        perf-stat.i.branch-instructions
      0.30 ±  4%      -0.0        0.29            +0.1        0.37        perf-stat.i.branch-miss-rate%
 2.701e+08 ±  3%      -1.2%  2.669e+08           +68.1%   4.54e+08        perf-stat.i.branch-misses
 1.833e+08 ±  4%     -11.7%  1.619e+08 ±  2%      -0.0%  1.832e+08 ±  2%  perf-stat.i.context-switches
      1.64 ±  7%      -4.9%       1.56            -6.0%       1.55 ±  2%  perf-stat.i.cpi
    407.20           +24.9%     508.59 ±  2%      +1.3%     412.44 ±  2%  perf-stat.i.cpu-migrations
      0.02 ±  6%      -0.0        0.02 ±  3%      -0.0        0.02 ±  2%  perf-stat.i.dTLB-load-miss-rate%
  2.08e+11 ±  3%      +3.1%  2.144e+11            +2.2%  2.125e+11        perf-stat.i.dTLB-loads
    741999 ±  5%     -27.0%     541820 ±  8%     -30.3%     516965 ±  2%  perf-stat.i.dTLB-store-misses
 1.219e+11 ±  3%      +3.5%  1.262e+11            +2.6%  1.251e+11        perf-stat.i.dTLB-stores
 7.463e+11 ±  3%      +3.3%  7.707e+11            +2.2%  7.628e+11        perf-stat.i.instructions
      0.92 ±  3%      +3.9%       0.96            +2.4%       0.94        perf-stat.i.metric.G/sec
     12710            -1.8%      12482            -0.8%      12610 ±  2%  perf-stat.i.minor-faults
     12715            -1.8%      12487            -0.8%      12615 ±  2%  perf-stat.i.page-faults
      0.17            -0.0        0.16            +0.1        0.28        perf-stat.overall.branch-miss-rate%
      0.02            -0.0        0.02 ±  3%      -0.0        0.02        perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  5%      -0.0        0.00 ±  8%      -0.0        0.00 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
 1.551e+11 ±  2%      +3.6%  1.606e+11            +2.1%  1.584e+11        perf-stat.ps.branch-instructions
  2.64e+08 ±  3%      -1.2%  2.607e+08           +68.0%  4.434e+08        perf-stat.ps.branch-misses
 1.791e+08 ±  4%     -11.8%   1.58e+08 ±  2%      -0.1%  1.789e+08 ±  2%  perf-stat.ps.context-switches
    398.46           +24.8%     497.28 ±  2%      +1.2%     403.37 ±  2%  perf-stat.ps.cpu-migrations
 2.033e+11 ±  3%      +3.0%  2.095e+11            +2.1%  2.076e+11        perf-stat.ps.dTLB-loads
    725199 ±  5%     -27.0%     529185 ±  8%     -30.4%     504934 ±  2%  perf-stat.ps.dTLB-store-misses
 1.192e+11 ±  3%      +3.4%  1.232e+11            +2.5%  1.222e+11        perf-stat.ps.dTLB-stores
 7.299e+11 ±  2%      +3.2%  7.531e+11            +2.1%  7.454e+11        perf-stat.ps.instructions
     12417            -1.8%      12194            -0.8%      12317 ±  2%  perf-stat.ps.minor-faults
     12422            -1.8%      12199            -0.8%      12322 ±  2%  perf-stat.ps.page-faults
 3.379e+13            +1.5%  3.431e+13            +0.4%  3.394e+13        perf-stat.total.instructions
     21355 ± 17%     +30.6%      27891 ±  9%      -9.7%      19290 ± 14%  sched_debug.cfs_rq:/.avg_vruntime.max
      2572 ±  8%     +23.2%       3167 ± 10%      -7.0%       2391 ±  9%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      3.46 ± 20%   +1005.0%      38.27 ± 20%     +22.4%       4.24 ± 44%  sched_debug.cfs_rq:/.load_avg.avg
    172.33 ± 33%    +353.4%     781.40 ± 10%     +88.5%     324.83 ± 74%  sched_debug.cfs_rq:/.load_avg.max
     14.69 ± 27%    +578.7%      99.70 ± 21%     +66.6%      24.48 ± 59%  sched_debug.cfs_rq:/.load_avg.stddev
      1391 ± 11%     -23.6%       1063 ± 16%      +9.2%       1518 ± 20%  sched_debug.cfs_rq:/system.slice.load.avg
      6227 ±  9%     -22.2%       4846 ± 13%      +4.5%       6509 ± 18%  sched_debug.cfs_rq:/system.slice.load.stddev
      4.52 ±  8%   +5661.0%     260.22 ±  7%      -5.1%       4.28 ± 11%  sched_debug.cfs_rq:/system.slice.load_avg.avg
     41.17 ± 22%   +1960.9%     848.40 ± 12%      -6.1%      38.67 ± 28%  sched_debug.cfs_rq:/system.slice.load_avg.max
      7.68 ±  8%   +2068.9%     166.64 ±  9%      -1.5%       7.57 ± 16%  sched_debug.cfs_rq:/system.slice.load_avg.stddev
     45.50 ± 85%     -70.1%      13.60 ± 46%     +44.7%      65.83 ± 95%  sched_debug.cfs_rq:/system.slice.se->avg.load_avg.max
      6.40 ± 50%     -59.6%       2.58 ± 17%     +26.1%       8.07 ± 87%  sched_debug.cfs_rq:/system.slice.se->avg.load_avg.stddev
     23349 ± 27%     -58.9%       9596 ± 14%      -7.7%      21550 ± 32%  sched_debug.cfs_rq:/system.slice.se->load.weight.avg
    462482 ± 18%     -65.8%     158309 ± 59%      -2.9%     448934 ± 25%  sched_debug.cfs_rq:/system.slice.se->load.weight.max
      1010 ± 10%     -89.4%     107.20 ± 10%      -7.0%     939.50 ± 34%  sched_debug.cfs_rq:/system.slice.se->load.weight.min
     71887 ± 20%     -76.9%      16583 ± 50%      -8.7%      65645 ± 25%  sched_debug.cfs_rq:/system.slice.se->load.weight.stddev
      1677 ± 27%   +2348.6%      41062 ±  8%     +30.5%       2188 ± 35%  sched_debug.cfs_rq:/system.slice.tg_load_avg.avg
      4973 ± 43%   +1480.9%      78625 ±  6%      -8.8%       4538 ± 36%  sched_debug.cfs_rq:/system.slice.tg_load_avg.max
      1361 ± 27%   +1922.4%      27528 ±  6%     +37.7%       1874 ± 42%  sched_debug.cfs_rq:/system.slice.tg_load_avg.min
    554.12 ± 61%   +2079.3%      12075 ± 12%     -12.5%     484.93 ± 46%  sched_debug.cfs_rq:/system.slice.tg_load_avg.stddev
     12.35 ± 36%   +2045.2%     265.00 ±  7%     +15.0%      14.21 ± 45%  sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.avg
     81.21 ± 44%    +112.1%     172.28 ± 11%     +11.4%      90.50 ± 47%  sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.stddev
     -0.34         -1648.0%       5.20 ±122%   -2398.6%       7.72 ±181%  sched_debug.cfs_rq:/system.slice/containerd.service.se->vruntime.min
    271.19 ±  5%     +53.5%     416.15 ± 11%      -3.8%     260.79 ±  8%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.load_avg.avg
      4.52 ±  8%     -21.1%       3.56 ± 14%      -5.3%       4.28 ± 12%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.load_avg.avg
     11670 ±  2%     -27.0%       8517 ±  6%      -2.3%      11398 ±  3%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->load.weight.min
     41222 ±  5%     +48.6%      61266 ± 10%      +1.9%      41999 ± 12%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.avg
     83749 ±  7%     +44.4%     120949 ±  6%      +0.9%      84484 ±  4%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.max
     28726 ±  8%     +52.0%      43652 ± 11%      -1.6%      28263 ± 12%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.min
     10569 ± 14%     +48.9%      15737 ± 13%     +11.8%      11814 ±  8%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg.stddev
    269.98 ±  8%     +48.4%     400.63 ± 12%      -1.7%     265.28 ± 10%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg_contrib.avg

