Message-ID: <20090321210154.GD7148@linux.vnet.ibm.com>
Date:	Sat, 21 Mar 2009 14:01:54 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 0/5] [GIT PULL] updates for tip/tracing/ftrace

On Sat, Mar 21, 2009 at 09:09:19PM +0100, Ingo Molnar wrote:
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> > On Sat, Mar 21, 2009 at 01:25:23PM -0400, Steven Rostedt wrote:
> > > On Sat, 21 Mar 2009, Ingo Molnar wrote:
> > > > * Ingo Molnar <mingo@...e.hu> wrote:

[ . . . ]

> > > > CONFIG_CLASSIC_RCU=y
> > > 
> > > All the crashes you reported only happen with classic RCU.
> > > 
> > > Paul,
> > > 
> > > Did anything change recently that could cause this lockup?
> > 
> > Arjan van de Ven is seeing a problem where a single 
> > synchronize_rcu() during bootup is taking a full second, which is 
> > currently thought to be due to some drivers spinning in the kernel 
> > (Arjan is working on a bootgraph that will hopefully pinpoint the 
> > problem: http://lkml.org/lkml/2009/3/21/7).  If the drivers were 
> > also instrumented with ftrace, they might (or might not) slow down 
> > even further, depending on exactly why they are spinning.
> 
> for one of the hung boxes in the past i waited 24 hours but it never 
> unwedged itself. The box that hung today is still hanging and the 
> RCU stall detector is still busy printing out those backtraces.

And on the last trace you emailed, the first and the last stall warning
are identical according to "diff".  In fact, they are all identical.
That is a bit unusual: one would normally expect to see slight differences
in the stack, because the scheduling-clock interrupt would hit the "longer
than average loop" in different places each time.

That would indicate either a very tight loop or a loop that has
interrupts enabled only in one spot.
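
For anyone wanting to repeat the "diff" check on a long console log, a
minimal sketch follows.  It is an illustration only, not the procedure
actually used on the trace above.  It assumes that each stall warning
begins with a line containing some marker string (the default "stall"
is a guess, adjust it to the real message text), that printk-style
"[ 12.345]" timestamps may prefix each line, and that the warning's
header line can be skipped because it may carry a changing time value.
The function names are hypothetical.

```python
import re

# Optional printk-style timestamp prefix, e.g. "[   12.345678] ".
# Assumption: logs may or may not carry it; it is stripped if present.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Remove a leading timestamp and surrounding whitespace."""
    return TIMESTAMP.sub("", line).strip()


def split_stall_reports(log_text, marker="stall"):
    """Split a log into blocks, one per stall warning.

    A new block starts at every line containing `marker`.  The marker
    line itself is not stored, since it may contain a changing time
    value.  Lines before the first marker are ignored.
    """
    blocks = []
    current = None
    for line in log_text.splitlines():
        if marker in line:
            if current is not None:
                blocks.append(current)
            current = []
            continue
        if current is not None:
            current.append(strip_timestamp(line))
    if current is not None:
        blocks.append(current)
    return blocks


def all_identical(blocks):
    """True if there is at least one block and all blocks are equal."""
    return bool(blocks) and all(b == blocks[0] for b in blocks)
```

Feeding the saved console output to split_stall_reports() and then to
all_identical() gives a quick yes/no answer.  A "yes" across many
warnings is what points toward a very tight loop or a single
interrupts-enabled spot.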

							Thanx, Paul