Message-ID: <20181228165451.GJ2509588@devbig004.ftw2.facebook.com>
Date: Fri, 28 Dec 2018 08:54:51 -0800
From: Tejun Heo <tj@...nel.org>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Sargun Dhillon <sargun@...gun.me>,
Xie XiuQi <xiexiuqi@...wei.com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, xiezhipeng1@...wei.com,
huawei.libin@...wei.com,
linux-kernel <linux-kernel@...r.kernel.org>,
Dmitry Adamushko <dmitry.adamushko@...il.com>,
Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages
Hello,
On Fri, Dec 28, 2018 at 10:30:07AM +0100, Vincent Guittot wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d1907506318a..88b9118b5191 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7698,7 +7698,8 @@ static void update_blocked_averages(int cpu)
> > * There can be a lot of idle CPU cgroups. Don't let fully
> > * decayed cfs_rqs linger on the list.
> > */
> > - if (cfs_rq_is_decayed(cfs_rq))
> > + if (cfs_rq_is_decayed(cfs_rq) &&
> > + rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
> > list_del_leaf_cfs_rq(cfs_rq);
>
> This patch reduces the cases, but I don't think it's enough because it
> doesn't cover the case of unregister_fair_sched_group().
> And we can still break the ordering of the cfs_rqs.
So, if unregister_fair_sched_group() can corrupt the list, the bug is
there regardless of a9e7f6544b9ce, right?
Is there a reason why we're building a dedicated list for avg
propagation? AFAICS, it's just doing a depth-first walk, which can be
done without extra space as long as each node has a parent pointer,
which they do. Is the dedicated list an optimization?
Thanks.
--
tejun