Message-ID: <CAKfTPtCscXjUoOY2EwSwDZ=Qsx0+TPO_ejP5Fh+ds==_hetMfA@mail.gmail.com>
Date: Thu, 21 Jul 2022 16:13:50 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Chengming Zhou <zhouchengming@...edance.com>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>, mingo@...hat.com,
peterz@...radead.org, rostedt@...dmis.org, bsegall@...gle.com,
vschneid@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH v2 07/10] sched/fair: use update_load_avg()
to attach/detach entity load_avg
On Thu, 21 Jul 2022 at 15:56, Chengming Zhou
<zhouchengming@...edance.com> wrote:
>
> On 2022/7/20 23:34, Vincent Guittot wrote:
> > On Wed, 20 Jul 2022 at 15:41, Chengming Zhou
> > <zhouchengming@...edance.com> wrote:
> >>
> >> On 2022/7/19 18:29, Vincent Guittot wrote:
> >>> On Fri, 15 Jul 2022 at 18:21, Chengming Zhou
> >>> <zhouchengming@...edance.com> wrote:
> >>>>
> >
> > ...
> >
> >>>>
> >>>>>
> >>>>> Looks to me that you want to replace this by your `freeze PELT when
> >>>>> outside fair` model.
> >>>>
> >>>> Yes, want to freeze PELT for two !fair cases:
> >>>>
> >>>> 1. !fair task hasn't been fair before: will still have its init load_avg
> >>>> when switch to fair.
> >>>
> >>> But I'm not sure it makes sense to keep these init values. As an
> >>> example, the util_avg is set according to the cpu utilization at the
> >>> time of the task creation. I would tend to decay them as these init
> >>> values become less and less relevant.
> >>>
> >>> so we should return early in post_init_entity_util_avg() and don't set
> >>> util_avg if sched class is not cfs
> >>
> >> Yes, this is indeed a problem if we attach the init sched_avg of a !fair task.
> >> I'm also not sure whether it makes sense to keep them at 0? Will it cause
> >> an unfairness problem between cfs_rqs?
> >
> > Why should it cause an unfairness problem ? !fair tasks are not
> > accounted and their pelt values will be decayed down to 0 after 320ms
> > anyway (with the current implementation). So it's just like if you
> > started directly after those 320ms
>
> Thanks for your patient explanation. IMHO, I am wondering: if we have an init
> sched_avg for a new fair task (A), but 0 for a new task switched from !fair (B),
> then what's the point of the init sched_avg for the fair task?
For a fair task, the util_avg is set according to the cpu's util_avg at
fork time, as an estimate of its coming utilization. The load_avg is
also set to the max value to ensure a min share. But these init values
only have a real impact during the first 320ms; after that they have
just disappeared from the *_avg.
For a !fair task, these init values will be decayed by
switched_to_fair(). This means that if a !fair task stays more than
320ms in a !fair class (running, runnable or sleeping) before switching
into the fair class, those init PELT values will have decayed to 0.
If we keep these init values instead, the initial util_avg, which was
set according to the cpu's util_avg at fork, will stay at a value that
is no longer relevant.
>
> Task B will need some time to reach its stable load value, so during that time
> its cfs_rq may not get enough shares? Imagine the scenario below: if we have fair
> task A and task B switched from !fair at the same time, could this cause
> unfairness between cfs0 and cfs1?
>
>       CPU0     tg     CPU1
>        |      /  \      |
>        |     /    \     |
>       cfs0          cfs1
>       (A)            (B)
>
> If runnable_avg and util_avg are 0 when switching from !fair, would we need more
> time to do load balancing or CPU frequency adjustment? I think that's the reason
> we have an init sched_avg for new fair tasks. Should we care about this, or will
> it be no problem?
But the util_avg that was set at fork time was derived from a cpu
util_avg that may by now be hundreds of seconds old.
>
> I'm not sure, I must have missed something :-)
>
> Thanks!
>
> >
> >>
> >>>
> >>>>
> >>>> 2. !fair task has been switched_from_fair(): will still keep its latest
> >>>> load_avg when switching to fair.
> >>>>
> >>>>>
> >>>>>> This patch make update_load_avg() the only location of attach/detach,
> >>>>>> and can handle these corner cases like change cgroup of NEW tasks,
> >>>>>> by checking last_update_time before attach/detach.
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>> @@ -11527,9 +11522,7 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
> >>>>>> struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >>>>>>
> >>>>>> /* Catch up with the cfs_rq and remove our load when we leave */
> >>>>>> - update_load_avg(cfs_rq, se, 0);
> >>>>>> - detach_entity_load_avg(cfs_rq, se);
> >>>>>> - update_tg_load_avg(cfs_rq);
> >>>>>> + update_load_avg(cfs_rq, se, UPDATE_TG | DO_DETACH);
> >>>>>
> >>>>> IMHO, the DO_[DE|AT]TACH comments in update_load_avg() would have to be
> >>>>> updated in this case.
> >>>>
> >>>> Correct, will do.
> >>>>
> >>>> Thanks.
> >>>>
> >>>>>
> >>>>> [...]