lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201207122513.GT3021@hirez.programming.kicks-ass.net>
Date:   Mon, 7 Dec 2020 13:25:13 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Marco Elver <elver@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: timers: Move clearing of base::timer_running under base::lock

On Mon, Dec 07, 2020 at 02:10:13AM +0100, Frederic Weisbecker wrote:
> On Sun, Dec 06, 2020 at 10:40:07PM +0100, Thomas Gleixner wrote:
> > syzbot reported KCSAN data races vs. timer_base::timer_running being set to
> > NULL without holding base::lock in expire_timers().
> > 
> > This looks innocent and most reads are clearly not problematic but for a
> > non-RT kernel it's completely irrelevant whether the store happens before
> > or after taking the lock. For an RT kernel moving the store under the lock
> > requires an extra unlock/lock pair in the case that there is a waiter for
> > the timer. But that's not the end of the world and definitely not worth the
> > trouble of adding boatloads of comments and annotations to the code. Famous
> > last words...
> 
> There is another thing I noticed lately wrt. del_timer_sync() VS timer execution:

> Here if the timer has previously executed on CPU 1 and then CPU 0 sees base->running_timer == NULL,
> it will return, assuming the timer has completed. But there is nothing to enforce the fact that x
> will be equal to 1. Enforcing that is a behaviour I would expect in this case since this is a kind
> of "wait for completion" function. But perhaps it doesn't apply here, in fact I have no idea...
> 
> But if we recognize that as an issue, we would need a mirroring load_acquire()/store_release() on
> base->running_timer.

Yeah, I think you're right. del_timer_sync() explicitly states it waits
for completion of the handler, so it isn't weird to then also expect to
be able to observe the results of the handler.

Thomas' patch fixes this.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ