Message-ID: <aHkrQXhRtYi3ydKo@arm.com>
Date: Thu, 17 Jul 2025 18:57:02 +0200
From: Beata Michalska <beata.michalska@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, clm@...a.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 00/12] sched: Address schbench regression
On Thu, Jul 17, 2025 at 03:04:55PM +0200, Beata Michalska wrote:
> Hi Peter,
>
> Below are the results of running schbench on Altra
> (as a reminder: 2-core MC, 2 NUMA nodes, 160 cores).
>
> Legend:
> - 'Flags=none' means neither TTWU_QUEUE_DEFAULT nor
> TTWU_QUEUE_DELAYED is set (or available); toggling these is
> sketched right after this legend.
> - '*…*' marks Top-3 Min & Max, Bottom-3 Std dev, and
> Top-3 90th-percentile values.
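
Side note for anyone reproducing this: the Flags rows map onto the
TTWU_QUEUE_DEFAULT / TTWU_QUEUE_DELAYED sched_feat bits added by this
series. Below is a minimal Python sketch of flipping them between runs;
it assumes debugfs is mounted at /sys/kernel/debug and a kernel that
actually carries these features (the write is rejected otherwise), and
it is illustrative only, not the harness behind the numbers below:

# Illustrative only: flip the ttwu_queue sched_feat bits between runs.
FEATURES = "/sys/kernel/debug/sched/features"

def set_feat(name, enable):
    # The features file takes "FEAT" to enable and "NO_FEAT" to disable.
    with open(FEATURES, "w") as f:
        f.write(name if enable else "NO_" + name)

# 'Flags=none' configuration: both bits off (where they exist).
for feat in ("TTWU_QUEUE_DEFAULT", "TTWU_QUEUE_DELAYED"):
    set_feat(feat, False)

# 'Flags=TTWU_QUEUE_DEFAULT/TTWU_QUEUE_DELAYED' configuration: both on.
for feat in ("TTWU_QUEUE_DEFAULT", "TTWU_QUEUE_DELAYED"):
    set_feat(feat, True)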
>
> Base 6.16-rc5
> Flags=none
> Min=681870.77 | Max=913649.50 | Std=53802.90 | 90th=890201.05
>
> sched/fair: bump sd->max_newidle_lb_cost when newidle balance fails
> Flags=none
> Min=770952.12 | Max=888047.45 | Std=34430.24 | 90th=877347.24
>
> sched/psi: Optimize psi_group_change() cpu_clock() usage
> Flags=none
> Min=748137.65 | Max=936312.33 | Std=56818.23 | 90th=*921497.27*
>
> sched/deadline: Less agressive dl_server handling
> Flags=none
> Min=783621.95 | Max=*944604.67* | Std=43538.64 | 90th=*909961.16*
>
> sched: Optimize ttwu() / select_task_rq()
> Flags=none
> Min=*826038.87* | Max=*1003496.73* | Std=49875.43 | 90th=*971944.88*
>
> sched: Use lock guard in ttwu_runnable()
> Flags=none
> Min=780172.75 | Max=914170.20 | Std=35998.33 | 90th=866095.80
>
> sched: Add ttwu_queue controls
> Flags=TTWU_QUEUE_DEFAULT
> Min=*792430.45* | Max=903422.78 | Std=33582.71 | 90th=887256.68
>
> Flags=none
> Min=*803532.80* | Max=894772.48 | Std=29359.35 | 90th=877920.34
>
> sched: Introduce ttwu_do_migrate()
> Flags=TTWU_QUEUE_DEFAULT
> Min=749824.30 | Max=*965139.77* | Std=57022.47 | 90th=903659.07
>
> Flags=none
> Min=787464.65 | Max=885349.20 | Std=27030.82 | 90th=875750.44
>
> psi: Split psi_ttwu_dequeue()
> Flags=TTWU_QUEUE_DEFAULT
> Min=762960.98 | Max=916538.12 | Std=42002.19 | 90th=876425.84
>
> Flags=none
> Min=773608.48 | Max=920812.87 | Std=42189.17 | 90th=871760.47
>
> sched: Re-arrange __ttwu_queue_wakelist()
> Flags=TTWU_QUEUE_DEFAULT
> Min=702870.58 | Max=835243.42 | Std=44224.02 | 90th=825311.12
>
> Flags=none
> Min=712499.38 | Max=838492.03 | Std=38351.20 | 90th=817135.94
>
> sched: Use lock guard in sched_ttwu_pending()
> Flags=TTWU_QUEUE_DEFAULT
> Min=729080.55 | Max=853609.62 | Std=43440.63 | 90th=838684.48
>
> Flags=none
> Min=708123.47 | Max=850804.48 | Std=40642.28 | 90th=830295.08
>
> sched: Change ttwu_runnable() vs sched_delayed
> Flags=TTWU_QUEUE_DEFAULT
> Min=580218.87 | Max=838684.07 | Std=57078.24 | 90th=792973.33
>
> Flags=none
> Min=721274.90 | Max=784897.92 | Std=*19017.78* | 90th=774792.30
>
> sched: Add ttwu_queue support for delayed tasks
> Flags=none
> Min=712979.48 | Max=830192.10 | Std=33173.90 | 90th=798599.66
>
> Flags=TTWU_QUEUE_DEFAULT
> Min=698094.12 | Max=857627.93 | Std=38294.94 | 90th=789981.59
>
> Flags=TTWU_QUEUE_DEFAULT/TTWU_QUEUE_DELAYED
> Min=683348.77 | Max=782179.15 | Std=25086.71 | 90th=750947.00
>
> Flags=TTWU_QUEUE_DELAYED
> Min=669822.23 | Max=807768.85 | Std=38766.41 | 90th=794052.05
>
> sched: fix ttwu_delayed
This one is actually:
sched: Add ttwu_queue support for delayed tasks
+
https://lore.kernel.org/all/0672c7df-543c-4f3e-829a-46969fad6b34@amd.com/
Apologies for that.
---
BR
Beata
> Flags=none
> Min=671844.35 | Max=798737.67 | Std=33438.64 | 90th=788584.62
>
> Flags=TTWU_QUEUE_DEFAULT
> Min=688607.40 | Max=828679.53 | Std=33184.78 | 90th=782490.23
>
> Flags=TTWU_QUEUE_DEFAULT/TTWU_QUEUE_DELAYED
> Min=579171.13 | Max=643929.18 | Std=*14644.92* | 90th=639764.16
>
> Flags=TTWU_QUEUE_DELAYED
> Min=614265.22 | Max=675172.05 | Std=*13309.92* | 90th=647181.10
>
>
> Best overall performer:
> sched: Optimize ttwu() / select_task_rq()
> Flags=none
> Min=*826038.87* | Max=*1003496.73* | Std=49875.43 | 90th=*971944.88*
>
> Hope this will be somewhat helpful.
>
> ---
> BR
> Beata
>
> On Wed, Jul 02, 2025 at 01:49:24PM +0200, Peter Zijlstra wrote:
> > Hi!
> >
> > Previous version:
> >
> > https://lkml.kernel.org/r/20250520094538.086709102@infradead.org
> >
> >
> > Changes:
> > - keep dl_server_stop(), just remove the 'normal' usage of it (juril)
> > - have the sched_delayed wake list IPIs do select_task_rq() (vingu)
> > - fixed lockdep splat (dietmar)
> > - added a few preparatory patches
> >
> >
> > Patches apply on top of tip/master (which includes the disabling of private futex)
> > and clm's newidle balance patch (which I'm awaiting vingu's ack on).
> >
> > Performance is similar to the last version, as tested on my SPR on v6.15 base:
> >
> > v6.15:
> > schbench-6.15.0-1.txt:average rps: 2891403.72
> > schbench-6.15.0-2.txt:average rps: 2889997.02
> > schbench-6.15.0-3.txt:average rps: 2894745.17
> >
> > v6.15 + patches 1-10:
> > schbench-6.15.0-dirty-4.txt:average rps: 3038265.95
> > schbench-6.15.0-dirty-5.txt:average rps: 3037327.50
> > schbench-6.15.0-dirty-6.txt:average rps: 3038160.15
> >
> > v6.15 + all patches:
> > schbench-6.15.0-dirty-deferred-1.txt:average rps: 3043404.30
> > schbench-6.15.0-dirty-deferred-2.txt:average rps: 3046124.17
> > schbench-6.15.0-dirty-deferred-3.txt:average rps: 3043627.10
> >
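
As a side note, aggregates like the Min/Max/Std/90th figures earlier in
this thread can be produced from per-run "average rps" lines like the
ones above. A rough Python sketch; the glob pattern, the sample standard
deviation and the 'exclusive' percentile method are assumptions of mine,
not necessarily how the figures above were computed:

# Rough sketch: collect one "average rps" value per schbench run and
# print the aggregate format used earlier in the thread.
# Needs at least two runs for stdev()/quantiles().
import re
import statistics
from pathlib import Path

rps = []
for log in sorted(Path(".").glob("schbench-*.txt")):   # placeholder pattern
    m = re.search(r"average rps:\s*([\d.]+)", log.read_text())
    if m:
        rps.append(float(m.group(1)))

print(f"Min={min(rps):.2f} | Max={max(rps):.2f} | "
      f"Std={statistics.stdev(rps):.2f} | "
      f"90th={statistics.quantiles(rps, n=10)[-1]:.2f}")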
> >
> > Patches can also be had here:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core
> >
> >
> > I'm hoping we can get this merged for next cycle so we can all move on from this.
> >
> >
>