Message-ID: <xhsmhikjr170z.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Thu, 17 Jul 2025 11:49:32 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Shijie Huang <shijie@...eremail.onmicrosoft.com>, Huang Shijie
<shijie@...amperecomputing.com>, mingo@...hat.com, peterz@...radead.org,
juri.lelli@...hat.com, vincent.guittot@...aro.org
Cc: patches@...erecomputing.com, cl@...ux.com,
Shubhang@...amperecomputing.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: do not scan twice in detach_tasks()
On 17/07/25 10:56, Shijie Huang wrote:
> On 2025/7/16 23:08, Valentin Schneider wrote:
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index b9b4bbbf0af6f..32ae24aa377ca 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -11687,7 +11687,7 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
>> * still unbalanced. ld_moved simply stays zero, so it is
>> * correctly treated as an imbalance.
>> */
>> - env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);
>> + env.loop_max = min(sysctl_sched_nr_migrate, busiest->cfs.h_nr_queued);
>
> I tested this patch, it did not work. I still can catch lots of
> occurrences of this issue in Specjbb test.
>
>
> IMHO, the root cause of this issue is env.loop_max is set out of the
> rq's lock.
>
> Even we set env.loop_max to busiest->cfs.h_nr_queued, the real tasks
> length still can shrink in other places.
>
Ah right, and updating the max in detach_tasks() itself isn't a complete
solution if we re-enter it due to LBF_NEED_BREAK. Never mind then :-)
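
To illustrate the stale-bound problem discussed above, here is a minimal
user-space sketch, not kernel code. It assumes a toy model in which the
scan always takes the tail entry and moves it to the head (roughly what
list_move() does for a task that cannot be migrated), and in which the
loop bound was sampled before the lock was taken. All names (toy_rq,
toy_scan, stale_bound_repeats) are hypothetical and exist only for this
example.

```c
#define MAX_TASKS 8

/* Toy stand-in for the busiest rq's task list: a rotating array of ids. */
struct toy_rq {
	int tasks[MAX_TASKS];
	int nr;			/* current number of queued tasks */
};

/* Take the tail entry and move it to the head; return its id. */
static int rotate_tail_to_head(struct toy_rq *rq)
{
	int tail = rq->tasks[rq->nr - 1];

	for (int i = rq->nr - 1; i > 0; i--)
		rq->tasks[i] = rq->tasks[i - 1];
	rq->tasks[0] = tail;
	return tail;
}

/*
 * Visit up to loop_max entries, assuming none of them can be migrated.
 * Returns how many visits landed on a task that was already scanned.
 */
static int toy_scan(struct toy_rq *rq, int loop_max)
{
	int seen[MAX_TASKS] = { 0 };
	int repeats = 0;

	for (int loop = 0; loop < loop_max && rq->nr > 0; loop++) {
		int id = rotate_tail_to_head(rq);

		if (seen[id])
			repeats++;
		seen[id] = 1;
	}
	return repeats;
}

/*
 * Assumption: the bound is sampled when nr_at_sample tasks are queued,
 * but only nr_at_scan (<= MAX_TASKS) remain once the scan runs.
 */
int stale_bound_repeats(int nr_at_sample, int nr_at_scan)
{
	struct toy_rq rq = { .nr = nr_at_scan };
	int loop_max = nr_at_sample;	/* sampled outside the "lock" */

	for (int i = 0; i < nr_at_scan; i++)
		rq.tasks[i] = i;
	return toy_scan(&rq, loop_max);
}
```

In this model, stale_bound_repeats(6, 3) returns 3: the bound was taken
when six tasks were queued, only three remain, and each of them is
scanned a second time. With a fresh bound, stale_bound_repeats(3, 3)
returns 0. The sketch does not model the LBF_NEED_BREAK re-entry case.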