Date:   Tue, 10 Mar 2020 08:57:41 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     王贇 <yun.wang@...ux.alibaba.com>
Cc:     Ben Segall <bsegall@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mel Gorman <mgorman@...e.de>,
        "open list:SCHEDULER" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too small

On Tue, 10 Mar 2020 at 04:42, 王贇 <yun.wang@...ux.alibaba.com> wrote:
>
>
>
> On 2020/3/9 7:15 PM, Vincent Guittot wrote:
> [snip]
> >>>> -       load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
> >>>> +       load = max(cfs_rq->load.weight, scale_load(cfs_rq->avg.load_avg));
> >>>>
> >>>>         tg_weight = atomic_long_read(&tg->load_avg);
> >>>
> >>> I get the point, but IMHO fixing scale_load_down() sounds better, to
> >>> cover all the similar cases; let's first try that way and see if it
> >>> works :-)
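
(For reference, these are the helpers under discussion, as defined for
CONFIG_64BIT in the kernels of this era; weights are kept in a 10-bit
fixed point, so scale_load_down() truncates any weight below 1024 to 0:)

/* include/linux/sched.h */
#define SCHED_FIXEDPOINT_SHIFT	10

/* kernel/sched/sched.h, under CONFIG_64BIT */
#define scale_load(w)		((w) << SCHED_FIXEDPOINT_SHIFT)
#define scale_load_down(w)	((w) >> SCHED_FIXEDPOINT_SHIFT)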
> >>
> >> Yeah, that might not be a bad idea either; it's just that this fix
> >> would keep you from losing all your precision (and I'd have to think
> >> about whether that would result in fairness issues, like all the group
> >> se's getting the full tg shares, or something like that).
> >
> > AFAICT, we already have a fairness problem, because scale_load_down()
> > is used in calc_delta_fair(), so all sched groups that have a weight
> > lower than 1024 end up with the same increase of their vruntime when
> > running.
> > The load_avg is then used to balance between rqs, so load_balance will
> > ensure at least 1 task per CPU, but no more, because the load_avg that
> > is then used will stay null.
> >
> > That being said, a min of 2 for scale_load_down() will give us
> > tg->load_avg != 0, hence tg_weight != 0, and each sched group will
> > not get the full shares. But it will still make all those low-weight
> > groups completely equal among themselves anyway.
> > The best solution would be not to scale down the weight, but that's a
> > bigger change.
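
(A minimal sketch of that min-of-2 variant, keeping the existing helper
name; this is the shape of the idea rather than a tested patch. It
keeps the scaled-down weight non-zero, so tg->load_avg and tg_weight
cannot collapse to 0, at the cost of making all the low-weight groups
equally heavy among themselves:)

#define scale_load_down(w)					\
({								\
	unsigned long __w = (w);				\
	if (__w)						\
		__w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT);	\
	__w;							\
})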
>
> Does that mean changing all those 'load.weight' related calculations,
> to preserve the scaled weight?

yes, to make sure that the calculations still fit in the variables

>
> I suppose u64 is big enough for 'cfs_rq.load' to hold the scaled-up
> load; changing all those places could be annoying but still fine.

it's fine, but the max number of runnable tasks at the max priority on
a cfs_rq will decrease from around 4 billion to "only" 4 million.
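
(Back of the envelope: keeping the weight scaled up spends the 10
fixed-point bits out of the same budget, so whatever the exact limit,
the headroom shrinks by 2^SCHED_FIXEDPOINT_SHIFT:
2^32 / 2^10 = 2^22, i.e. roughly 4 billion down to 4 million.)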

>
> However, I'm not quite sure about the benefit: how much more precision
> will we gain, and does that really matter? It would be better to have
> some testing to demonstrate it.

it will ensure better fairness over a larger range of share values. I
agree that we can wonder whether it's worth the effort for those low
share values. It would be interesting to know who uses such low values
and for what purpose.

Regards,
Vincent
>
> Regards,
> Michael Wang
>
>
> >
