lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YOQWXVzj5T+AGKWg@hirez.programming.kicks-ass.net>
Date:   Tue, 6 Jul 2021 10:37:49 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Marco Elver <elver@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: RCU vs data_race()

On Mon, Jun 21, 2021 at 06:37:57AM -0700, Paul E. McKenney wrote:
> Because the static inline functions end up in RCU's tranlation units,
> and they do write to other state.  See for example the line marked with
> the asterisk at the end of this email, which is apparently between
> __run_timers() and rcu_torture_reader().  But gdb says that this is
> really between this statement in __run_timers():
> 
> 	base->running_timer = NULL;

(I'm finding that in expire_timers(), not in __run_timers(), and both
are in kernel/time/timer.c)

> And this statement in try_to_del_timer_sync():
> 
> 	if (base->running_timer != timer)
> 
> Because of inline functions, running KCSAN on RCU can detect races in
> other subsystems.  In this case, I could argue for a READ_ONCE() on the
> "if" condition, but last I knew, the rules for timers are that C-language
> writes do not participate in data races.  So maybe I should add that
> READ_ONCE(), but running KCSAN on rcutorture would still complain.

That code is correct as is, AFAICT, so READ_ONCE() is not the right
annotation.  Also, READ_ONCE() would not actively help the situation in
any case, and arguably make the code worse and more confusing.


What happens here is that we read base->running_timer while holding
base->lock. We set base->running_timer to !0 holding that same
base->lock from the timer IRQ context. We clear base->running_timer
*without* base->lock, from the timer IRQ context.

So yes, there's a race between the locked load of base->running_timer
and the timer IRQ on another CPU clearing it.

But since the comparison is an equality test and the only purpose of the
clear is to destroy that equality, any partial clear serves that
purpose.

Now, a partial clear could create a false positive match for another
timer, which in turn could make try_to_del_timer_sync() fail spuriously
I suppose, but any caller needs to be able to deal with failure of that
function, so no harm done there.


*HOWEVER* timer_curr_running() violates all that.. but that looks like
it's only ever used as a diagnostic, so that should be fine too.


Anyway, AFAIU the problem is on the WRITE side, so any READ_ONCE() will
not ever fix anything here. If we want to get rid of the possible
spurious fail in try_to_del_timer_sync() we need to consistently
WRITE/READ_ONCE() all of base->running_timer, if we don't care, we need
a data_race() annotation somewhere, although I'm not currently seeing
that inlining you mention that's needed for this problem to manifest in
the first place.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ