Message-ID: <CAKfTPtDLUHqChcFwWfPJwLqdkU1jANEYVfHhS6fuD5f0Gy4KRg@mail.gmail.com>
Date: Fri, 5 Dec 2025 16:08:37 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Samuel Wu <wusamuel@...gle.com>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [PATCH] sched/fair: Fix pelt lost idle time detection

On Tue, 2 Dec 2025 at 01:24, Samuel Wu <wusamuel@...gle.com> wrote:
>
> On Wed, Oct 8, 2025 at 6:12 AM Vincent Guittot
> <vincent.guittot@...aro.org> wrote:
> >
> > The check for lost idle pelt time should always be done when
> > pick_next_task_fair() fails to pick a task, not only when it is called
> > from the fair fast-path.
> >
> > This case happens when the last running task on the rq is an RT or DL task.
> > When that task goes to sleep and the \Sum of util_sum of the rq is at its
> > max value, we don't account for the lost idle time whereas we should.
> >
> > Fixes: 67692435c411 ("sched: Rework pick_next_task() slow-path")
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> > ---
> >
> > I noticed this while reviewing [1]
> >
> > [1] https://lore.kernel.org/all/20251006105453.648473106@infradead.org/
> >
> > kernel/sched/fair.c | 26 +++++++++++++-------------
> > 1 file changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index b3be1e2749ce..dd0ea01af730 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8920,21 +8920,21 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
> > return p;
> >
> > idle:
> > - if (!rf)
> > - return NULL;
> > -
> > - new_tasks = sched_balance_newidle(rq, rf);
> > + if (rf) {
> > + new_tasks = sched_balance_newidle(rq, rf);
> >
> > - /*
> > - * Because sched_balance_newidle() releases (and re-acquires) rq->lock, it is
> > - * possible for any higher priority task to appear. In that case we
> > - * must re-start the pick_next_entity() loop.
> > - */
> > - if (new_tasks < 0)
> > - return RETRY_TASK;
> > + /*
> > + * Because sched_balance_newidle() releases (and re-acquires)
> > + * rq->lock, it is possible for any higher priority task to
> > + * appear. In that case we must re-start the pick_next_entity()
> > + * loop.
> > + */
> > + if (new_tasks < 0)
> > + return RETRY_TASK;
> >
> > - if (new_tasks > 0)
> > - goto again;
> > + if (new_tasks > 0)
> > + goto again;
> > + }
> >
> > /*
> > * rq is about to be idle, check if we need to update the
> > --
> > 2.43.0
> >
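For reference, the resulting idle: path with this patch applied should look
roughly like the sketch below. It is reconstructed from the hunk above;
update_idle_rq_clock_pelt() and the trailing comment come from the existing
surrounding code in fair.c as I recall it, they are not changed by this patch:

idle:
        if (rf) {
                new_tasks = sched_balance_newidle(rq, rf);

                /*
                 * Because sched_balance_newidle() releases (and re-acquires)
                 * rq->lock, it is possible for any higher priority task to
                 * appear. In that case we must re-start the pick_next_entity()
                 * loop.
                 */
                if (new_tasks < 0)
                        return RETRY_TASK;

                if (new_tasks > 0)
                        goto again;
        }

        /*
         * rq is about to be idle, check if we need to update the
         * lost_idle_time of clock_pelt
         */
        update_idle_rq_clock_pelt(rq);

        return NULL;

The functional change is that update_idle_rq_clock_pelt() is now reached even
when rf is NULL (the slow-path pick), which covers the case described above
where the last runnable task was RT or DL; that helper accounts the elapsed
time as lost idle time when the rq's util_sum is saturated.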
>
> Hi all,
>
> I am seeing a power regression with this patch. This
The problem is that this patch fixes wrong load tracking: without the
fix, load can be underestimated on systems that become loaded.
> test was performed on Pixel 6 running android-mainline (6.18.0-rc7
> based); all scheduling vendor hooks are disabled, and I'm not seeing
> any obvious sched code differences compared to the vanilla upstream
> kernel. I am still actively working to see if I can find a simpler
> sequence to reproduce this on mainline Linux.
>
> The Wattson tool reports a ~30-40% increase in average power with the
> patch vs. baseline (patch reverted). This regression
For a particular use case?
> correlates with two other metrics:
> 1. Increased residency at higher CPU frequencies
> 2. A significant increase in sugov invocations (at least 10x)
>
> Data in the tables below are collected from a 10s run of a bouncing
> ball animation, with and without the patch.
> +-----------------------------------+-------------+--------------------+
> |                                   | with patch  | without patch      |
> +-----------------------------------+-------------+--------------------+
> | sugov invocation rate (Hz) | 133.5 | 3.7 |
> +-----------------------------------+-------------+--------------------+
>
> +--------------+----------------------+----------------------+
> | | with patch: | without patch: |
> | Freq (kHz) | time spent (ms) | time spent (ms) |
> +--------------+----------------------+----------------------+
> | 738000 | 4869 | 9869 |
> | 1803000 | 2936 | 68 |
> | 1598000 | 1072 | 0 |
> | 1704000 | 674 | 0 |
> | ... | ... | ... |
> +--------------+----------------------+----------------------+
>
> Thanks!
> Sam