Message-ID: <20260123110306.GA217302@noisy.programming.kicks-ass.net>
Date: Fri, 23 Jan 2026 12:03:06 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mario Roy <marioeroy@...il.com>
Cc: Chris Mason <clm@...a.com>,
Joseph Salisbury <joseph.salisbury@...cle.com>,
Adam Li <adamli@...amperecomputing.com>,
Hazem Mohamed Abuelfotoh <abuehaze@...zon.com>,
Josh Don <joshdon@...gle.com>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
kprateek.nayak@....com
Subject: Re: [PATCH 4/4] sched/fair: Proportional newidle balance
On Fri, Jan 23, 2026 at 11:50:46AM +0100, Peter Zijlstra wrote:
> On Sun, Jan 18, 2026 at 03:46:22PM -0500, Mario Roy wrote:
> > The patch "Proportional newidle balance" introduced a regression
> > with Linux 6.12.65 and 6.18.5. There is a noticeable regression with
> > easyWave testing. [1]
> >
> > The CPU is an AMD Threadripper 9960X (24/48). I followed the source
> > to install easyWave [2]. That means fetching the two tar.gz archives.
>
> What is the actual configuration of that chip? Is it like 3*8 or 4*6
> (CCX wise). A quick google couldn't find me the answer :/
Obviously I found it right after sending this. It's a 4x6 config.
Meaning it needs newidle to balance between those 4 domains.
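
For anyone wanting to confirm the CCX layout on their own machine, here is
a minimal sketch. It assumes the L3 cache is exposed as cache/index3 in
sysfs (check the adjacent "level" file if unsure); the helper name
list_l3_domains is made up for illustration and is not part of any tool.

```bash
#!/bin/bash
# Sketch: list the distinct L3 cache domains (CCXs on AMD parts) from sysfs.
# Assumption: cache/index3 is the L3 cache on this machine.

# list_l3_domains [SYSFS_CPU_ROOT]
# Prints each distinct set of CPUs sharing an L3 once, one set per line.
# The optional argument overrides the sysfs root, which helps when testing.
list_l3_domains() {
    local root="${1:-/sys/devices/system/cpu}"
    # Every CPU reports the list of CPUs it shares its L3 with;
    # de-duplicating those lists leaves one line per L3 domain.
    cat "$root"/cpu[0-9]*/cache/index3/shared_cpu_list 2>/dev/null | sort -u
}
```

Piping the result through `wc -l` gives the number of L3 domains; a part
with four CCXs should report 4.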
Prateek -- are you guys still considering that SIS_NODE thing? That
worked really well for workstation chips, but there were some issues on
Epyc or so.
> > #!/bin/bash
> > # CXXFLAGS="-O3 $CXXFLAGS" ./configure
> > # make -j8
> >
> > trap 'rm -f *.ssh *.idx *.log *.sshmax *.time' EXIT
> >
> > OMP_NUM_THREADS=48 ./src/easywave \
> > -grid examples/e2Asean.grd -source examples/BengkuluSept2007.flt \
> > -time 1200
> >
> >
> > Before results with CachyOS 6.12.63-2 and 6.18.3-2 kernels.
>
> So the problem is that 6.12 -> 6.18 is an enormous amount of kernel
> releases :/ This patch in particular was an effort to fix a regression
> caused by:
>
> 155213a2aed4 ("sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails")
>
> I'm thinking that if you revert all 4 patches of this series your
> performance will be even worse?
>
> Anyway, my guess is that somehow this benchmark likes doing newidle even
> if it is often not successful. I'll see if I can reproduce this on one
> of my machines, but that might take a little while.