Message-ID: <c78aa476-44c7-4691-ae6b-d4b5ebc83c25@inria.fr>
Date: Mon, 30 Oct 2023 11:02:13 +0100
From: Keisuke Nishimura <keisuke.nishimura@...ia.fr>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org,
Dietmar Eggemann <dietmar.eggemann@....com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Julia Lawall <julia.lawall@...ia.fr>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] sched/fair: Fix the decision for load balance
On 30/10/2023 09:05, Vincent Guittot wrote:
> On Mon, 30 Oct 2023 at 05:03, Shrikanth Hegde
> <sshegde@...ux.vnet.ibm.com> wrote:
>>
>>
>>
>> On 10/27/23 10:47 PM, Keisuke Nishimura wrote:
>>> should_we_balance is called for the decision to do load-balancing.
>>> When sched ticks invoke this function, only one CPU should return
>>> true. However, in the current code, two CPUs can return true. The
>>> following situation, where b means busy and i means idle, is an
>>> example because CPU 0 and CPU 2 return true.
>>>
>>> [0, 1] [2, 3]
>>> b b i b
>>>
>>> This fix checks if there exists an idle CPU with busy sibling(s)
>>> after looking for a CPU on an idle core. If some idle CPUs with busy
>>> siblings are found, just the first one should do load-balancing.
>>>
>>
>>> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
>>> Signed-off-by: Keisuke Nishimura <keisuke.nishimura@...ia.fr>
>>> ---
>>> kernel/sched/fair.c | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 2048138ce54b..eff0316d6c7d 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
>>> return cpu == env->dst_cpu;
>>> }
>>>
>>
>>
>> There is comment above this /* Are we the first idle CPU? */
>> Maybe update that comment as /* Are we the first idle core */
>
> I was about to say the same but it's not always true. If we are at SMT
> level, we look for an idle CPU in the core
>
Maybe I should update the comment with some additional context:
/*
* Are we the first idle core in a sched_domain not sharing capacity,
* or the first idle CPU in a sched_domain sharing capacity?
*/
>>
>>> - if (idle_smt == env->dst_cpu)
>>> - return true;
>>> + /* Is there an idle CPU with busy siblings? */
>> nit: We can keep the comment style fixed in this function.
>> /* Are we the first idle CPU with busy siblings */
>>
OK, agreed. Should I send a version 2 of the patch?
Thanks,
Keisuke
>>> + if (idle_smt != -1)
>>> + return idle_smt == env->dst_cpu;
>>>
>>> /* Are we the first CPU of this group ? */
>>> return group_balance_cpu(sg) == env->dst_cpu;
>>
>> code changes LGTM
>> Reviewed-by: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
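
For readers following the thread, the double-balancing case from the patch description can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: it covers the tick path when balancing between cores (the domain does not have SD_SHARE_CPUCAPACITY), it assumes 4 CPUs with sibling pairs [0, 1] and [2, 3], and it assumes group_balance_cpu() is CPU 0. The names core_is_idle(), swb_model() and count_balancers() are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   4	/* assumption: 2 cores x 2 SMT siblings */
#define SMT_WIDTH 2	/* assumption: siblings are numbered contiguously */

/* Hypothetical stand-in for is_core_idle(): all siblings of cpu are idle. */
static bool core_is_idle(const bool idle[NR_CPUS], int cpu)
{
	int first = cpu - cpu % SMT_WIDTH;

	for (int s = first; s < first + SMT_WIDTH; s++)
		if (!idle[s])
			return false;
	return true;
}

/*
 * Simplified model of the tick path of should_we_balance() when balancing
 * between cores. 'fixed' selects the final check proposed in the patch;
 * otherwise the check from before the patch is used.
 */
static bool swb_model(const bool idle[NR_CPUS], int dst_cpu, bool fixed)
{
	int idle_smt = -1;

	assert(dst_cpu >= 0 && dst_cpu < NR_CPUS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!idle[cpu])
			continue;

		if (!core_is_idle(idle, cpu)) {
			/* Remember the first idle CPU with busy siblings. */
			if (idle_smt == -1)
				idle_smt = cpu;
			/* Skip the remaining siblings of this core. */
			cpu = cpu - cpu % SMT_WIDTH + SMT_WIDTH - 1;
			continue;
		}

		/* Are we the first idle CPU on an idle core? */
		return cpu == dst_cpu;
	}

	if (fixed) {
		/* Patched: an idle SMT CPU exists, so only it balances. */
		if (idle_smt != -1)
			return idle_smt == dst_cpu;
	} else {
		/* Before the patch: other CPUs fall through to the group check. */
		if (idle_smt == dst_cpu)
			return true;
	}

	/* Assumption: group_balance_cpu() is CPU 0 in this model. */
	return dst_cpu == 0;
}

/* Count how many CPUs would decide to balance for a given idle map. */
static int count_balancers(const bool idle[NR_CPUS], bool fixed)
{
	int n = 0;

	for (int dst = 0; dst < NR_CPUS; dst++)
		if (swb_model(idle, dst, fixed))
			n++;
	return n;
}
```

With the idle map { false, false, true, false } (b b i b), the model without the fix counts two balancing CPUs (0 and 2), and with the fix only CPU 2. When all CPUs are busy, both variants fall back to the assumed group balance CPU.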