Date:   Thu, 27 Dec 2018 17:15:24 -0800
From:   Tejun Heo <tj@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Sargun Dhillon <sargun@...gun.me>,
        Xie XiuQi <xiexiuqi@...wei.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>, xiezhipeng1@...wei.com,
        huawei.libin@...wei.com,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Dmitry Adamushko <dmitry.adamushko@...il.com>,
        Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages

Happy holidays, everyone.

(cc'ing Rik, who has been looking at the scheduler code a lot lately)

On Thu, Dec 27, 2018 at 10:15:17AM -0800, Linus Torvalds wrote:
> [ goes off and looks ]
> 
> Oh. unthrottle_cfs_rq -> enqueue_entity -> list_add_leaf_cfs_rq()
> doesn't actually seem to hold the rq lock at all. It's just called
> under a rcu read lock.

I'm pretty sure enqueue_entity() *has* to be called with the rq lock
held.  unthrottle_cfs_rq() is called from tg_set_cfs_bandwidth(),
distribute_cfs_runtime() and unthrottle_offline_cfs_rqs().  The first
two grab the rq_lock just around the calls and the last one has a
lockdep assert on the rq_lock.  What am I missing?
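The calling convention described above can be sketched in a minimal,
userspace model (hypothetical, simplified names; this is not the actual
kernel code): two callers wrap the call in the lock, the third only
asserts that its own caller already holds it, with a flag standing in
for lockdep's held-lock tracking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Model of the locking pattern: rq_lock stands in for the per-runqueue
 * lock, and lockdep_assert_held() for the kernel's lockdep annotation.
 * All names are illustrative only. */

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static bool rq_lock_held;

static void lockdep_assert_held(void)
{
	assert(rq_lock_held);	/* models lockdep_assert_held(&rq->lock) */
}

/* unthrottle_cfs_rq() must run with the rq lock held. */
static int unthrottled;
static void unthrottle_cfs_rq(void)
{
	lockdep_assert_held();
	unthrottled++;
}

/* Callers like tg_set_cfs_bandwidth(): grab the lock just around
 * the call. */
static void tg_set_cfs_bandwidth(void)
{
	pthread_mutex_lock(&rq_lock);
	rq_lock_held = true;
	unthrottle_cfs_rq();
	rq_lock_held = false;
	pthread_mutex_unlock(&rq_lock);
}

/* unthrottle_offline_cfs_rqs(): lock already held by its caller,
 * so it only asserts. */
static void unthrottle_offline_cfs_rqs(void)
{
	lockdep_assert_held();
	unthrottle_cfs_rq();
}
```

Either way, enqueue_entity() ends up running under the rq lock, which
is the point being made above.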

> So it all seems to depend on that "on_list" flag for exclusion. Which
> seems fundamentally racy, since it's not protected by a lock.

The only place on_list is accessed without holding rq_lock is
unregister_fair_sched_group().  It's a minor optimization on a
relatively cold path (group destruction), so if it's racy there, I
think we can take out that optimization.  I'd be surprised if anyone
notices that.

That said, I don't think it's broken.  A false positive on on_list is
fine, and I can't see how a false negative could happen: the only
event that can set it is the sched entity getting scheduled, and the
removal path can't race against that transition.

> But that still makes me go "how come is this only noticed 18 months
> after the fact"?

Unless I'm totally confused, which is definitely possible, I don't
think there's a race condition; the only bug is the tmp_alone_branch
pointer being left dangling, which maybe doesn't happen all that
often?

Thanks.

-- 
tejun
