linux-kernel - Re: [PATCH] sched/fair: do not scan twice in detach

Message-ID: <xhsmhikjr170z.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Thu, 17 Jul 2025 11:49:32 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Shijie Huang <shijie@...eremail.onmicrosoft.com>, Huang Shijie
 <shijie@...amperecomputing.com>, mingo@...hat.com, peterz@...radead.org,
 juri.lelli@...hat.com, vincent.guittot@...aro.org
Cc: patches@...erecomputing.com, cl@...ux.com,
 Shubhang@...amperecomputing.com, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: do not scan twice in detach_tasks()

On 17/07/25 10:56, Shijie Huang wrote:
> On 2025/7/16 23:08, Valentin Schneider wrote:
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index b9b4bbbf0af6f..32ae24aa377ca 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -11687,7 +11687,7 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
>>   		 * still unbalanced. ld_moved simply stays zero, so it is
>>   		 * correctly treated as an imbalance.
>>   		 */
>> -		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
>> +		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->cfs.h_nr_queued);
>
> I tested this patch, and it did not work. I can still catch lots of
> occurrences of this issue in the Specjbb test.
>
> IMHO, the root cause of this issue is that env.loop_max is set outside
> the rq's lock.
>
> Even if we set env.loop_max to busiest->cfs.h_nr_queued, the real length
> of the task list can still shrink in other places.
>

Ah right, and updating the max in detach_tasks() itself isn't a complete
solution if we re-enter it due to LBF_NEED_BREAK. Never mind then :-)
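
For illustration only, here is a toy userspace model of the problem
discussed above. It is not kernel code: snapshot_loop_max() and
count_rescans() are hypothetical helpers, and the task list is reduced to
a counter. It assumes that no task can be migrated, so every visited task
is rotated back onto the list and the scan keeps going until the bound
runs out.

```c
#include <assert.h>

#define MAX_TASKS 64

/*
 * Model of the bound being computed before the rq lock is held:
 * min(nr_migrate, nr_queued), using whatever nr_queued was at that time.
 */
static unsigned int snapshot_loop_max(unsigned int nr_migrate,
				      unsigned int nr_queued)
{
	return nr_migrate < nr_queued ? nr_migrate : nr_queued;
}

/*
 * Scan at most loop_max entries of a circular list that now holds
 * list_len tasks. Returns how many visits landed on a task that had
 * already been visited during this scan.
 */
static unsigned int count_rescans(unsigned int loop_max,
				  unsigned int list_len)
{
	unsigned int visits[MAX_TASKS] = { 0 };
	unsigned int rescans = 0;
	unsigned int loop;

	assert(list_len <= MAX_TASKS);

	if (list_len == 0)
		return 0;

	for (loop = 0; loop < loop_max; loop++) {
		unsigned int idx = loop % list_len;

		/* A non-zero count means this task was already scanned. */
		if (visits[idx]++)
			rescans++;
	}

	return rescans;
}
```

With a bound snapshotted while 8 tasks were queued and a list that has
shrunk to 5 by the time it is scanned, count_rescans(8, 5) reports 3
repeated visits. If the list length still matches the snapshot, it
reports none.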