Message-ID: <20150203110040.GJ26304@twins.programming.kicks-ass.net>
Date: Tue, 3 Feb 2015 12:00:40 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RFC] Make rcu_dereference_raw() safe for NMI etc.
On Mon, Feb 02, 2015 at 11:55:33AM -0800, Paul E. McKenney wrote:
> As promised/threatened on IRC.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Reverse rcu_dereference_check() conditions
>
> The rcu_dereference_check() family of primitives evaluates the RCU
> lockdep expression first, and only then evaluates the expression passed
> in. This works fine normally, but can potentially fail in environments
> (such as NMI handlers) where lockdep cannot be invoked. The problem is
> that even if the expression passed in is "1", the compiler would need to
> prove that the RCU lockdep expression (rcu_read_lock_held(), for example)
> is free of side effects in order to be able to elide it. Given that
> rcu_read_lock_held() is sometimes separately compiled, the compiler cannot
> always use this optimization.
>
> This commit therefore reverses the order of evaluation, so that the
> expression passed in is evaluated first, and the RCU lockdep expression is
> evaluated only if the passed-in expression evaluates to false, courtesy
> of the C-language short-circuit boolean evaluation rules. This compels
> the compiler to forgo executing the RCU lockdep expression in cases
> where the passed-in expression evaluates to "1" at compile time, so that
> (for example) rcu_dereference_raw() can be guaranteed to execute safely
> within an NMI handler.
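For reference, a minimal sketch of the reordering (simplified from the
real macros, which also carry the sparse annotation plumbing):

/* Before: the lockdep expression runs first.  Even when c == 1 the
 * compiler must assume the separately compiled rcu_read_lock_held()
 * has side effects, so the call cannot be elided.
 */
#define rcu_dereference_check(p, c) \
        __rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)

/* After: the caller's expression runs first, so c == 1 short-circuits
 * and rcu_read_lock_held() is provably dead code.  In particular
 * rcu_dereference_raw(), which passes c == 1, never invokes it.
 */
#define rcu_dereference_check(p, c) \
        __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)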
My particular worry yesterday was tracing; I was looking at
rcu_read_{,un}lock_notrace() and wondered what would happen if I used
list_for_each_entry_rcu() under it.
_If_ it did indeed make that call, we could end up in:
list_entry_rcu() -> rcu_dereference_raw() -> rcu_dereference_check()
-> rcu_read_lock_held() -> rcu_lockdep_current_cpu_online()
-> preempt_disable()
And preempt_disable() is a traceable thing -- not to mention half the
callstack above doesn't have notrace annotations and would equally
generate function trace events.
Thereby rendering the RCU list ops unsuitable for use under the
_notrace() RCU primitives.
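Concretely, the sort of (hypothetical) tracer-side code I had in mind;
my_entry, my_list and handle_entry are made-up names:

struct my_entry {
        struct list_head node;
        /* payload... */
};

static LIST_HEAD(my_list);

static void handle_entry(struct my_entry *e) { /* ... */ }

/* Meant to run from tracing context without generating further
 * trace events, hence the _notrace() RCU primitives.
 */
static void my_trace_callback(void)
{
        struct my_entry *e;

        rcu_read_lock_notrace();
        /* Pre-patch, the rcu_dereference_raw() buried in
         * list_for_each_entry_rcu() evaluates the lockdep expression
         * and ends up in preempt_disable(), which is itself traceable
         * and so recurses back into the tracer.
         */
        list_for_each_entry_rcu(e, &my_list, node)
                handle_entry(e);
        rcu_read_unlock_notrace();
}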
So yes, fully agreed on this patch.
Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>
FWIW I think I won't be needing the RCU _notrace() bits (for now), but
that it led to this patch made it worth it anyhow ;-)