Date:   Thu, 27 Dec 2018 17:15:24 -0800
From:   Tejun Heo <tj@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Sargun Dhillon <sargun@...gun.me>,
        Xie XiuQi <xiexiuqi@...wei.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>, xiezhipeng1@...wei.com,
        huawei.libin@...wei.com,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Dmitry Adamushko <dmitry.adamushko@...il.com>,
        Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages

Happy holidays, everyone.

(cc'ing Rik, who has been looking at the scheduler code a lot lately)

On Thu, Dec 27, 2018 at 10:15:17AM -0800, Linus Torvalds wrote:
> [ goes off and looks ]
> 
> Oh. unthrottle_cfs_rq -> enqueue_entity -> list_add_leaf_cfs_rq()
> doesn't actually seem to hold the rq lock at all. It's just called
> under a rcu read lock.

I'm pretty sure enqueue_entity() *has* to be called with the rq lock
held.  unthrottle_cfs_rq() is called from tg_set_cfs_bandwidth(),
distribute_cfs_runtime() and unthrottle_offline_cfs_rqs().  The first
two grab the rq_lock just around the calls and the last one has a
lockdep assert on the rq_lock.  What am I missing?
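The calling convention described above can be sketched in a minimal,
userspace model (hypothetical, simplified names; this is not the actual
kernel code): two callers wrap the call in the lock, the third only
asserts that its own caller already holds it, with a flag standing in
for lockdep's held-lock tracking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Model of the locking pattern: rq_lock stands in for the per-runqueue
 * lock, and lockdep_assert_held() for the kernel's lockdep annotation.
 * All names are illustrative only. */

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static bool rq_lock_held;

static void lockdep_assert_held(void)
{
	assert(rq_lock_held);	/* models lockdep_assert_held(&rq->lock) */
}

/* unthrottle_cfs_rq() must run with the rq lock held. */
static int unthrottled;
static void unthrottle_cfs_rq(void)
{
	lockdep_assert_held();
	unthrottled++;
}

/* Callers like tg_set_cfs_bandwidth(): grab the lock just around
 * the call. */
static void tg_set_cfs_bandwidth(void)
{
	pthread_mutex_lock(&rq_lock);
	rq_lock_held = true;
	unthrottle_cfs_rq();
	rq_lock_held = false;
	pthread_mutex_unlock(&rq_lock);
}

/* unthrottle_offline_cfs_rqs(): lock already held by its caller,
 * so it only asserts. */
static void unthrottle_offline_cfs_rqs(void)
{
	lockdep_assert_held();
	unthrottle_cfs_rq();
}
```

Either way, enqueue_entity() ends up running under the rq lock, which
is the point being made above.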

> So it all seems to depend on that "on_list" flag for exclusion. Which
> seems fundamentally racy, since it's not protected by a lock.

The only place on_list is accessed without holding rq_lock is
unregister_fair_sched_group().  It's a minor optimization on a
relatively cold path (group destruction), so if it's racy there, I
think we can take out that optimization.  I'd be surprised if anyone
notices that.

That said, I don't think it's broken.  A false positive on on_list is
fine, and I can't see how a false negative could happen: the only
event that can set it is the sched entity getting scheduled, and the
removal path can't race against that transition.

> But that still makes me go "how come is this only noticed 18 months
> after the fact"?

Unless I'm totally confused, which is definitely possible, I don't
think there's a race condition; the only bug is the tmp_alone_branch
pointer being left dangling, which maybe doesn't happen all that
often?

Thanks.

-- 
tejun
