linux-kernel - Re: [PATCH 0/5] [GIT PULL] updates for tip/tracing/ftrace


Date:	Sat, 21 Mar 2009 14:01:54 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 0/5] [GIT PULL] updates for tip/tracing/ftrace

On Sat, Mar 21, 2009 at 09:09:19PM +0100, Ingo Molnar wrote:
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> > On Sat, Mar 21, 2009 at 01:25:23PM -0400, Steven Rostedt wrote:
> > > On Sat, 21 Mar 2009, Ingo Molnar wrote:
> > > > * Ingo Molnar <mingo@...e.hu> wrote:

[ . . . ]

> > > > CONFIG_CLASSIC_RCU=y
> > > 
> > > All the crashes you reported only happen with classic RCU.
> > > 
> > > Paul,
> > > 
> > > Did anything change recently that could cause this lockup?
> > 
> > Arjan van de Ven is seeing a problem where a single 
> > synchronize_rcu() during bootup is taking a full second, which is 
> > currently thought to be due to some drivers spinning in the kernel 
> > (Arjan is working on a bootgraph that will hopefully pinpoint the 
> > problem: http://lkml.org/lkml/2009/3/21/7).  If the drivers were 
> > also instrumented with ftrace, they might (or might not) slow down 
> > even further, depending on exactly why they are spinning.
> 
> for one of the hung boxes in the past i waited 24 hours but it never 
> unwedged itself. The box that hung today is still hanging and the 
> RCU stall detector is still busy printing out those backtraces.

And on the last trace you emailed, the first and the last stall warning
are identical according to "diff".  In fact, they are all identical.
That is a bit unusual: one would normally expect to see slight differences
in the stack, because the scheduling-clock interrupt would hit the "longer
than average loop" in a different place each time.

That would indicate either a very tight loop or a loop that has
interrupts enabled only in one spot.

							Thanx, Paul