Message-ID: <CAKfTPtAMjfnNHu_JdDSi1BSMoXKgGkm2+G_QLsUBDAhq1wFvZQ@mail.gmail.com>
Date: Fri, 28 Dec 2018 18:25:37 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Tejun Heo <tj@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Sargun Dhillon <sargun@...gun.me>,
Xie XiuQi <xiexiuqi@...wei.com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, xiezhipeng1@...wei.com,
huawei.libin@...wei.com,
linux-kernel <linux-kernel@...r.kernel.org>,
Dmitry Adamushko <dmitry.adamushko@...il.com>,
Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages
On Fri, 28 Dec 2018 at 17:54, Tejun Heo <tj@...nel.org> wrote:
>
> Hello,
>
> On Fri, Dec 28, 2018 at 10:30:07AM +0100, Vincent Guittot wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index d1907506318a..88b9118b5191 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -7698,7 +7698,8 @@ static void update_blocked_averages(int cpu)
> > > * There can be a lot of idle CPU cgroups. Don't let fully
> > > * decayed cfs_rqs linger on the list.
> > > */
> > > - if (cfs_rq_is_decayed(cfs_rq))
> > > + if (cfs_rq_is_decayed(cfs_rq) &&
> > > + rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
> > > list_del_leaf_cfs_rq(cfs_rq);
> >
> > This patch reduces the cases but I don't think it's enough because it
> > doesn't cover the case of unregister_fair_sched_group(),
> > and we can still break the ordering of the cfs_rqs.
>
> So, if unregister_fair_sched_group() can corrupt the list, the bug is
> there regardless of a9e7f6544b9ce, right?
I don't think so, because without a9e7f6544b9ce the insertion in the
list is done only once, and we can't call unregister_fair_sched_group()
while an enqueue is ongoing, so tmp_alone_branch always points to
rq->leaf_cfs_rq_list.
a9e7f6544b9ce makes it possible for tmp_alone_branch to not point to
rq->leaf_cfs_rq_list.
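As an aside, that invariant can be written down as a trivial helper
(an untested sketch on my side, not something in the tree, and the
helper name is made up):

/*
 * Untested sketch: when no enqueue is in progress, tmp_alone_branch
 * must point back at the list head; otherwise a partially attached
 * branch exists and deleting entries can break the ordering.
 */
static inline bool leaf_cfs_rq_list_is_settled(struct rq *rq)
{
	return rq->tmp_alone_branch == &rq->leaf_cfs_rq_list;
}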
>
> Is there a reason why we're building a dedicated list for avg
> propagation? AFAICS, it's just doing depth first walk, which can be
> done without extra space as long as each node has the parent pointer,
> which they do. Is the dedicated list an optimization?
We have this list of sched groups of the rq to update the load of each
cfs_rq and, as a result, the load of the task group. This is then used
to compute the share of a task group between CPUs.
This list must be ordered to correctly propagate the updates from the
leaves to the root.
The dedicated list also avoids having to parse and walk every
task_group struct each time; instead, you just have to follow a linked
list.
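For comparison, without the dedicated list we would have to do
something like the following on every update (again an untested
sketch), visiting every task group in the system even when it never
had any load on this CPU:

struct task_group *tg;

rcu_read_lock();
list_for_each_entry_rcu(tg, &task_groups, list) {
	struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];

	/* update the load of cfs_rq here ... */
}
rcu_read_unlock();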
Regards,
Vincent
>
> Thanks.
>
> --
> tejun