linux-kernel - Re: [PATCH] sched/fair: handle case of task_h

Message-ID: <CAKfTPtDzsryz=V3WKo5zPkvWSagNAh1tr+ZaV5UwXBr7xMQPUQ@mail.gmail.com>
Date:   Thu, 9 Jul 2020 15:52:29 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Valentin Schneider <valentin.schneider@....com>
Subject: Re: [PATCH] sched/fair: handle case of task_h_load() returning 0

On Thu, 9 Jul 2020 at 15:34, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
>
> On 08/07/2020 11:47, Vincent Guittot wrote:
> > On Wed, 8 Jul 2020 at 11:45, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>
> >> On 02/07/2020 16:42, Vincent Guittot wrote:
> >>> task_h_load() can return 0 in some situations like running stress-ng
> >>> mmapfork, which forks thousands of threads, in a sched group on a 224-core
> >>> system. The load balance doesn't handle this correctly because
> >>
> >> I guess the issue here is that 'cfs_rq->h_load' in
> >>
> >> task_h_load() {
> >>     struct cfs_rq *cfs_rq = task_cfs_rq(p);
> >>     ...
> >>     return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
> >>                     cfs_rq_load_avg(cfs_rq) + 1);
> >> }
> >>
> >> is still ~0 (or at least pretty small) compared to se.avg.load_avg being
> >> 1024 and cfs_rq_load_avg(cfs_rq) n*1024 in these lb occurrences.
> >>
> >>> env->imbalance never decreases and it will stop pulling tasks only after
> >>> reaching loop_max, which can be equal to the number of running tasks of
> >>> the cfs. Make sure that imbalance will be decreased by at least 1.
>
> Looks like it's bounded by sched_nr_migrate (32 on my E5-2690 v2).

yes

>
> env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
>
> [...]
>
> >> I assume that this is related to the LKP mail
> >
> > I found this problem while studying the regression raised in the
> > email below, but this patch doesn't fix that regression. At least,
> > it's not enough on its own.
> >
> >>
> >> https://lkml.kernel.org/r/20200421004749.GC26573@shao2-debian ?
>
> I see. It also happens with other workloads but it's most visible
> at the beginning of a workload (fork).
>
> Still on E5-2690 v2 (2*2*10, 40 CPUs):
>
> In the taskgroup cfs_rq->h_load is ~ 1024/40 = 25 so this leads to
> task_h_load = 0 with cfs_rq->avg.load_avg 40 times higher than the
> individual task load (1024).
>
> One incarnation of 20 loops w/o any progress (that's w/o your patch).
>
> With loop='loop/loop_break/loop_max'
> and load='p->se.avg.load_avg/cfs_rq->h_load/cfs_rq->avg.load_avg'
>
> Jul  9 10:41:18 e105613-lin kernel: [73.068844] [stress-ng-mmapf 2907] SMT CPU37->CPU17 imb=8 loop=1/32/32 load=1023/23/43006
> Jul  9 10:41:18 e105613-lin kernel: [73.068873] [stress-ng-mmapf 3501] SMT CPU37->CPU17 imb=8 loop=2/32/32 load=1022/23/41983
> Jul  9 10:41:18 e105613-lin kernel: [73.068890] [stress-ng-mmapf 2602] SMT CPU37->CPU17 imb=8 loop=3/32/32 load=1023/23/40960
> ...
> Jul  9 10:41:18 e105613-lin kernel: [73.069136] [stress-ng-mmapf 2520] SMT CPU37->CPU17 imb=8 loop=18/32/32 load=1023/23/25613
> Jul  9 10:41:18 e105613-lin kernel: [73.069144] [stress-ng-mmapf 3107] SMT CPU37->CPU17 imb=8 loop=19/32/32 load=1021/23/24589
> Jul  9 10:41:18 e105613-lin kernel: [73.069149] [stress-ng-mmapf 2672] SMT CPU37->CPU17 imb=8 loop=20/32/32 load=1024/23/23566
> ...
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@....com>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@....com>

Thanks
