Date:	Tue, 13 Aug 2013 12:45:12 +0800
From:	Lei Wen <adrian.wenl@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Paul Turner <pjt@...gle.com>, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...nel.org>, leiwen@...vell.com
Subject: Re: false nr_running check in load balance?

Peter,

On Mon, Aug 12, 2013 at 10:43 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, Aug 06, 2013 at 09:23:46PM +0800, Lei Wen wrote:
>> Hi Paul,
>>
>> I noticed that the load_balance() function checks busiest->nr_running
>> to decide whether to perform the actual task movement.
>>
>> But in some cases I saw that nr_running does not match the number of
>> tasks actually in the queue, which seems to make the scheduler do a lot
>> of redundant checking.
>> What I mean is: there is only one task in the queue, but nr_running
>> says there are two. So if that task cannot be moved, it still gets
>> checked twice.
>>
>> Digging further, I found a patch you submitted earlier:
>> commit 953bfcd10e6f3697233e8e5128c611d275da39c1
>> Author: Paul Turner <pjt@...gle.com>
>> Date:   Thu Jul 21 09:43:27 2011 -0700
>>
>>     sched: Implement hierarchical task accounting for SCHED_OTHER
>>
>> In this patch you also increment nr_running in enqueue_task_stop(),
>> which is why nr_running goes up while the number of tasks in the queue
>> does not. It is true that at that point the stopper has been woken up
>> and enqueued on the CPU to do the migration job, so the logic there
>> should be right.
>>
>> My question is whether we could change the check to cfs_rq->nr_running,
>> since load_balance() only works on CFS tasks, right?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index bb456f4..ffc0d35 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5096,7 +5096,7 @@ redo:
>>         schedstat_add(sd, lb_imbalance[idle], env.imbalance);
>>
>>         ld_moved = 0;
>> -       if (busiest->nr_running > 1) {
>> +       if (busiest->cfs.nr_running > 1) {
>>                 /*
>>                  * Attempt to move tasks. If find_busiest_group has found
>>                  * an imbalance but busiest->nr_running <= 1, the group is
>>
>
> Not quite right; I think you need busiest->cfs.h_nr_running.
> cfs.nr_running is the number of entries running in this 'group'. If
> you've got nested groups like:
>
>  'root'
>    \
>    'A'
>    / \
>   t1 t2
>
> root.nr_running := 1 (just the group entity 'A'), even though you've
> got multiple running tasks.
>

You're absolutely right about this. :)
I missed it because I didn't consider the group case...
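
Just to restate my understanding with a toy example (plain userspace C,
not kernel code; the struct below and its fields are only stand-ins for
the real rq/cfs_rq counters), this is how I picture the counters for the
hierarchy you drew:

/* toy model of the counters for: root -> 'A' -> { t1, t2 } */
#include <stdio.h>

struct fake_cfs_rq {
	unsigned int nr_running;	/* entities directly on this cfs_rq */
	unsigned int h_nr_running;	/* tasks in the whole hierarchy below */
};

int main(void)
{
	struct fake_cfs_rq root = { 0, 0 };
	struct fake_cfs_rq A = { 0, 0 };
	unsigned int rq_nr_running;	/* stand-in for rq->nr_running */

	/* t1 and t2 are enqueued in group A */
	A.nr_running += 2;
	A.h_nr_running += 2;

	/* on the root cfs_rq the group entity 'A' counts as one entity... */
	root.nr_running += 1;
	/* ...but both tasks still show up in the hierarchical count */
	root.h_nr_running += 2;

	/* rq->nr_running also counts non-CFS entities, e.g. the stopper */
	rq_nr_running = root.h_nr_running + 1;

	printf("root.nr_running   = %u\n", root.nr_running);	/* 1 */
	printf("root.h_nr_running = %u\n", root.h_nr_running);	/* 2 */
	printf("rq->nr_running    = %u\n", rq_nr_running);	/* 3 */
	return 0;
}

So only h_nr_running tells load_balance() how many CFS tasks it could
actually try to move.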

Then do you think it is necessary to make the change below in the load_balance() code?
 -       if (busiest->nr_running > 1) {
 +       if (busiest->cfs.h_nr_running > 1) {


Thanks,
Lei
