Message-ID: <CAKfTPtA8VjZ-F3vdiKJtpij6PUxu_OR7-54cG6NcS_K7guHF6g@mail.gmail.com>
Date:   Wed, 6 Apr 2022 17:33:28 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     彭志刚 <zgpeng.linux@...il.com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org,
        Benjamin Segall <bsegall@...gle.com>, mgorman@...e.de,
        bristot@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Return the busiest group only if imbalance is meaningful

On Wed, 6 Apr 2022 at 17:07, 彭志刚 <zgpeng.linux@...il.com> wrote:
>
> Yes. The following is a specific scenario where a negative value occurs:
>
> The scheduler domain contains four groups: groupA, groupB, groupC and
> groupD. The group types and avg_load values of the four groups are as
> follows:
>
> GroupA    TYPE= group_fully_busy    avg_load=10    [local group]
> GroupB    TYPE= group_has_spare     avg_load=1
> GroupC    TYPE= group_has_spare     avg_load=1
> GroupD    TYPE= group_overloaded    avg_load=20    [busiest group]
>
> The CPU that calls load_balance is located in groupA, and
> update_sd_lb_stats will select the busiest group among groupB, groupC
> and groupD, that is, groupD. Under this condition, the other checks in
> the find_busiest_group function are bypassed and the calculate_imbalance
> function is called. The checks in the middle of calculate_imbalance
> cannot stop this situation either, so we reach the imbalance calculation
> logic at the end of the function. At this point the domain's avg_load=8,
> but the local group's avg_load=10; the negative value is therefore
> generated.
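
To make the arithmetic concrete, here is a minimal standalone sketch of
the avg_load-based computation at the tail of calculate_imbalance(), fed
with the numbers from the scenario above (the variables and the min()
helper are simplified stand-ins, not the kernel's types, and the group
capacities are illustrative):

#include <stdio.h>

#define SCHED_CAPACITY_SCALE 1024L

static long min_l(long a, long b) { return a < b ? a : b; }

int main(void)
{
	long domain_avg_load  = 8;	/* sds->avg_load                */
	long local_avg_load   = 10;	/* groupA, group_fully_busy     */
	long busiest_avg_load = 20;	/* groupD, group_overloaded     */
	long capacity = SCHED_CAPACITY_SCALE;	/* illustrative capacity */

	/*
	 * Mirrors the "move everyone towards the domain average" term:
	 * the imbalance is the minimum of what can be taken from the
	 * busiest group and what the local group is below the average.
	 */
	long imbalance = min_l((busiest_avg_load - domain_avg_load) * capacity,
			       (domain_avg_load - local_avg_load) * capacity) /
			 SCHED_CAPACITY_SCALE;

	/*
	 * Prints -2: the local group already sits above the domain
	 * average, so the second term goes negative and nothing clamps it.
	 */
	printf("imbalance = %ld\n", imbalance);
	return 0;
}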

I think that this case should be covered by an additional test in
calculate_imbalance, because we should not try to pull load into groupA
if it's already above the average load. Something like below should
cover this:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9469,6 +9469,16 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s

                sds->avg_load = (sds->total_load * SCHED_CAPACITY_SCALE) /
                                sds->total_capacity;
+
+               /*
+                * Don't pull any tasks if this group is already above the
+                * domain average load.
+                */
+               if (local->avg_load >= sds->avg_load) {
+                       env->imbalance = 0;
+                       return;
+               }
+
                /*
                 * If the local group is more loaded than the selected
                 * busiest group don't try to pull any tasks.

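With that test in place, the scenario above sets env->imbalance to 0 and
returns early, so find_busiest_group() hands back NULL and no tasks are
pulled towards groupA. Note that the existing check a few lines below
only compares the local group against the selected busiest group (10 < 20
here), which is why it cannot catch this case on its own.
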
>
>
> Vincent Guittot <vincent.guittot@...aro.org> wrote on Wed, 6 Apr 2022 at 20:59:
>>
>> On Wed, 6 Apr 2022 at 13:23, zgpeng <zgpeng.linux@...il.com> wrote:
>> >
>> > When the calculate_imbalance function calculates the imbalance, it may
>> > actually produce a negative number. In this case, it is meaningless to
>>
>> We should not return a negative imbalance but I suppose this can
>> happen when we are using the avg_load metrics to calculate imbalance.
>> Have you faced a use case where imbalance is negative?
>>
>> > return the so-called busiest group and continue searching for the
>> > busiest CPU later. Therefore, the busiest group should be returned only
>> > when the imbalance is greater than 0; otherwise return NULL.
>> >
>> > Signed-off-by: zgpeng <zgpeng@...cent.com>
>> > Reviewed-by: Samuel Liao <samuelliao@...cent.com>
>> > ---
>> >  kernel/sched/fair.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index 601f8bd..9f75303 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -9639,7 +9639,7 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
>> >  force_balance:
>> >         /* Looks like there is an imbalance. Compute it */
>> >         calculate_imbalance(env, &sds);
>> > -       return env->imbalance ? sds.busiest : NULL;
>> > +       return env->imbalance > 0 ? sds.busiest : NULL;
>> >
>> >  out_balanced:
>> >         env->imbalance = 0;
>> > --
>> > 2.9.5
>> >
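
For completeness, the reason the "> 0" comparison in the patch matters:
env->imbalance is a signed quantity, and in C any non-zero value,
negative included, is truthy, so the original test still selects the
busiest group when the computed imbalance is negative. A tiny standalone
illustration (not kernel code):

#include <stdio.h>

int main(void)
{
	long imbalance = -2;	/* the negative result from the scenario above */

	/* Old test: any non-zero value, negative included, picks busiest. */
	printf("imbalance ? ...     -> %s\n", imbalance ? "busiest" : "NULL");

	/* Patched test: only a meaningful positive imbalance does. */
	printf("imbalance > 0 ? ... -> %s\n", imbalance > 0 ? "busiest" : "NULL");
	return 0;
}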
