Message-ID: <28340138-a00e-47bc-a36f-270a01ac83b4@meta.com>
Date: Mon, 6 Oct 2025 17:23:46 -0400
From: Chris Mason <clm@...a.com>
To: Joseph Salisbury <joseph.salisbury@...cle.com>, clm@...com
Cc: Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>, Ingo Molnar <mingo@...hat.com>,
dietmar.eggemann@....com, Steven Rostedt <rostedt@...dmis.org>,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [REGRESSION][v6.17-rc1]sched/fair: Bump sd->max_newidle_lb_cost
when newidle balance fails
On 10/6/25 4:23 PM, Joseph Salisbury wrote:
> Hi Chris,
>
> During testing, we are seeing a ~6% performance regression with the
> upstream stable v6.12.43 kernel (and the Oracle UEK
> 6.12.0-104.43.4.el9uek.x86_64 kernel) when running the Phoronix
> pts/apache benchmark with 100 concurrent requests [0]. The regression
> is seen with the following hardware:
>
> PROCESSOR: Intel Xeon Platinum 8167M
>   Core Count: 8
>   Thread Count: 16
>   Extensions: SSE 4.2 + AVX512CD + AVX2 + AVX + RDRAND + FSGSBASE
>   Cache Size: 16 MB
>   Microcode: 0x1
>   Core Family: Cascade Lake
>
> After performing a bisect, we found that the performance regression was
> introduced by the following commit:
>
> Stable v6.12.43: fc4289233e4b ("sched/fair: Bump sd->max_newidle_lb_cost
> when newidle balance fails")
> Mainline v6.17-rc1: 155213a2aed4 ("sched/fair: Bump
> sd->max_newidle_lb_cost when newidle balance fails")
>
> Reverting this commit eliminates the performance regression.
>
> I was hoping to get your feedback, since you are the patch author. Do
> you think gathering any additional data will help diagnose this issue?
Hi everyone,

Peter, we've had a collection of regression reports based on this
change, so it sounds like we need to make it less aggressive, or maybe
we need to make the degrading of the cost number more aggressive?

Joe (and everyone else who has hit this), can I talk you into trying the
drgn from

https://lore.kernel.org/lkml/2fbf24bc-e895-40de-9ff6-5c18b74b4300@meta.com/

I'm curious if it degrades at all or just gets stuck up high.

-chris