Message-ID: <20160318003322.GC4287@linux.vnet.ibm.com>
Date:	Thu, 17 Mar 2016 17:33:22 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Chris Metcalf <cmetcalf@...lanox.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Russell King <linux@....linux.org.uk>,
	Thomas Gleixner <tglx@...utronix.de>,
	Aaron Tomlin <atomlin@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, Andrew Morton <akpm@...l.org>,
	Daniel Thompson <daniel.thompson@...aro.org>, x86@...nel.org,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/4] nmi_backtrace: add more trigger_*_cpu_backtrace() methods

On Thu, Mar 17, 2016 at 08:17:59PM -0400, Chris Metcalf wrote:
> On 3/17/2016 6:55 PM, Paul E. McKenney wrote:
> >The RCU stall-warn stack traces can be ugly, agreed.
> >
> >That said, RCU used to use NMI-based stack traces, but switched to the
> >current scheme due to the NMIs having the unfortunate habit of locking
> >things up, which IIRC often meant no stack traces at all.  If I recall
> >correctly, one of the problems was self-deadlock in printk().
> 
> Steven Rostedt enabled the per_cpu printk func support in June 2014, and
> the nmi_backtrace code uses it to just capture printk output to percpu
> buffers, so I think it's going to be a lot more robust than earlier attempts.

That would be a very good thing, give or take the "I think" qualifier.
And assuming that the target CPU is healthy enough to find its way back
to some place that can dump the per-CPU printk buffer.  I might well
be overly paranoid, but I have to suspect that the probability of that
buffer getting dumped is greatly reduced on a CPU that isn't healthy
enough to respond to RCU.

But it seems like enabling the experiment might be useful.

"Try enabling the NMI version.  If that doesn't get you your RCU CPU
stall warning stack trace, try the remote-print variant."

Or I suppose we could just do both in succession, just in case their
console was a serial port.  ;-)

							Thanx, Paul
