Message-ID: <CAG2KctpO6VKS6GN4QWDji0t92_gNBJ7HjjXrE+6H+RwRXt=iLg@mail.gmail.com>
Date: Mon, 1 Dec 2025 16:23:51 -0800
From: Samuel Wu <wusamuel@...gle.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [PATCH] sched/fair: Fix pelt lost idle time detection
On Wed, Oct 8, 2025 at 6:12 AM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> The check for lost idle pelt time should always be done when
> pick_next_task_fair() fails to pick a task, not only when it is called
> from the fair fast-path.
>
> The case happens when the last running task on the rq is an RT or DL
> task. When the latter goes to sleep while the sum of the rq's util_sum
> signals is at the max value, we fail to account for the lost idle time
> even though we should.
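
(Context for readers without a tree handy: the detection in question
lives in update_idle_rq_clock_pelt(). A trimmed paraphrase of that
helper from kernel/sched/pelt.h follows -- check your tree for the
exact form:

    static inline void update_idle_rq_clock_pelt(struct rq *rq)
    {
            u32 divider = ((LOAD_AVG_MAX - 1024) << SCHED_CAPACITY_SHIFT)
                            - LOAD_AVG_MAX;
            u32 util_sum = rq->cfs.avg.util_sum;

            util_sum += rq->avg_rt.util_sum;
            util_sum += rq->avg_dl.util_sum;

            /*
             * A saturated util_sum means the rq is treated as always
             * running, with no idle time to steal; the would-be idle
             * time is instead accumulated as lost, relative to
             * clock_task.
             */
            if (util_sum >= divider)
                    rq->lost_idle_time += rq_clock_task(rq) - rq->clock_pelt;

            _update_idle_rq_clock_pelt(rq);
    }
)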
>
> Fixes: 67692435c411 ("sched: Rework pick_next_task() slow-path")
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>
> I noticed this while reviewing [1]
>
> [1] https://lore.kernel.org/all/20251006105453.648473106@infradead.org/
>
> kernel/sched/fair.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b3be1e2749ce..dd0ea01af730 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8920,21 +8920,21 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
> return p;
>
> idle:
> - if (!rf)
> - return NULL;
> -
> - new_tasks = sched_balance_newidle(rq, rf);
> + if (rf) {
> + new_tasks = sched_balance_newidle(rq, rf);
>
> - /*
> - * Because sched_balance_newidle() releases (and re-acquires) rq->lock, it is
> - * possible for any higher priority task to appear. In that case we
> - * must re-start the pick_next_entity() loop.
> - */
> - if (new_tasks < 0)
> - return RETRY_TASK;
> + /*
> + * Because sched_balance_newidle() releases (and re-acquires)
> + * rq->lock, it is possible for any higher priority task to
> + * appear. In that case we must re-start the pick_next_entity()
> + * loop.
> + */
> + if (new_tasks < 0)
> + return RETRY_TASK;
>
> - if (new_tasks > 0)
> - goto again;
> + if (new_tasks > 0)
> + goto again;
> + }
>
> /*
> * rq is about to be idle, check if we need to update the
> --
> 2.43.0
>
Hi all,

I'm seeing a power regression with this patch. The test was performed
on a Pixel 6 running android-mainline (6.18.0-rc7 based); all
scheduling vendor hooks are disabled, and I don't see any obvious
sched code differences relative to the vanilla upstream kernel. I am
still working on finding a simpler sequence that reproduces this on
mainline Linux.
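
For reference, my reading of the idle path after the patch (paraphrased
from the hunk above; the tail call is the lost idle time update
discussed in the changelog):

    idle:
            if (rf) {
                    new_tasks = sched_balance_newidle(rq, rf);

                    /* newidle balance dropped rq->lock; retry the pick */
                    if (new_tasks < 0)
                            return RETRY_TASK;
                    if (new_tasks > 0)
                            goto again;
            }

            /*
             * Now reached even when rf == NULL, e.g. when the pick
             * happens after an RT/DL task goes to sleep, so the lost
             * idle time check is no longer skipped in that case.
             */
            update_idle_rq_clock_pelt(rq);
            return NULL;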

Wattson reports a ~30-40% increase in average power with the patch
versus baseline (patch reverted). The regression correlates with two
other metrics:

1. Increased residency at higher CPU frequencies
2. A significant increase in sugov invocations (at least 10x)

The data in the tables below were collected from a 10s run of a
bouncing-ball animation, with and without the patch.

+----------------------------+------------+---------------+
|                            | with patch | without patch |
+----------------------------+------------+---------------+
| sugov invocation rate (Hz) |      133.5 |           3.7 |
+----------------------------+------------+---------------+

+------------+------------------+------------------+
|            | with patch       | without patch    |
| Freq (kHz) | time spent (ms)  | time spent (ms)  |
+------------+------------------+------------------+
|     738000 |             4869 |             9869 |
|    1803000 |             2936 |               68 |
|    1598000 |             1072 |                0 |
|    1704000 |              674 |                0 |
|        ... |              ... |              ... |
+------------+------------------+------------------+
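
One (unconfirmed) reading of the mechanism: lost_idle_time is
subtracted from the PELT clock, so accruing it more often slows
utilization decay across idle, and sugov then sees higher utilization
and picks higher OPPs. Paraphrasing kernel/sched/pelt.h again:

    static inline u64 rq_clock_pelt(struct rq *rq)
    {
            lockdep_assert_rq_held(rq);
            assert_clock_updated(rq);

            /* time accrued as lost idle time never elapses for PELT */
            return rq->clock_pelt - rq->lost_idle_time;
    }

I haven't confirmed that this is what is happening here, though; it is
just the most direct path I can see from the patch to the sugov
behavior above.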
Thanks!
Sam