lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Sat, 23 Sep 2017 10:15:58 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org, ngo Molnar <mingo@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        stable@...r.kernel.org
Subject: Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer

On Sat, Sep 23, 2017 at 07:22:04AM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 23:07:37 -0700
> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > OK, how about the following?
> > 
> > 	It turns out that functions called from rcu_irq_enter() can
> > 	be subject to various kinds of tracing, which can result in
> > 	rcu_irq_enter() being invoked recursively.  This recursion
> > 	causes RCU to not have been watching when it should have,
> > 	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
> > 	to rcu_nmi_enter() still resulted in failures because of calls
> > 	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
> > 	rcu_nmi_exit().  Such calls again cause RCU to not be watching
> > 	when it should have been.
> > 
> > 	In particular, the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic when RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter(), as will be the case when tracing uses
> > 	rcu_nmi_enter() to cause RCU to start watching the current CPU.
> > 	In that case, rcu_irq_enter() actually switches RCU back to
> > 	the not-watching state for this CPU, which results in lockdep
> > 	splats complaining about rcu_read_lock() being used on an idle
> > 	(not-watched) CPU.  The first patch of this series addressed
> > 	this problem by having rcu_irq_enter() and rcu_irq_exit()
> > 	refrain from doing anything when rcu_nmi_enter() caused RCU to
> > 	start watching this CPU.  The third patch in this series caused
> > 	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
> > 	as needed, so this fourth patch now removes the rcu_irq_enter()
> > 	and rcu_irq_exit() from within the stack tracer.
> 
> I think it's a bit too much ;-)  I believe it talks too much about the
> RCU internals, and the bug will be lost on us mere mortals.
> 
> > 
> > > Actually, thinking about this more, this doesn't need to go in stable.
> > > As recursive rcu_irq_enter() calls should not hurt, and you now allow
> > > rcu_irq_enter() to be called even after a rcu_nmi_enter() right?  
> > 
> > Yes, it is now the case that rcu_irq_enter() can be called even after
> > an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
> > this and takes an early exit if so.
> > 
> > But what is it about older kernels prevents the tracing-induced recursive
> > calls to rcu_irq_enter()?
> 
> Ah, the bug is if rcu_irq_enter() is called, and the stack trace
> triggers then. OK, lets keep it simple, and just say this.
> 
> 
>     Currently the stack tracer calls rcu_irq_enter() to make sure RCU
>     is watching when it records a stack trace. But if the stack tracer
>     is triggered while tracing inside of a rcu_irq_enter(), calling
>     rcu_irq_enter() unconditionally can be problematic.
> 
>     The reason for having rcu_irq_enter() in the first place has been
>     fixed from within the saving of the stack trace code, and there's no
>     reason for doing it in the stack tracer itself. Just remove it.
> 
> Simple and to the point.

Works for me!

							Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ