linux-kernel - Re: rcu_prempt stalls / lockup

Message-ID: <20140401183245.GA12473@linux.vnet.ibm.com>
Date:	Tue, 1 Apr 2014 11:32:45 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Dave Jones <davej@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: rcu_prempt stalls / lockup

On Tue, Apr 01, 2014 at 02:04:14PM -0400, Dave Jones wrote:
> On Tue, Apr 01, 2014 at 10:55:45AM -0700, Paul E. McKenney wrote:
>  > >  > > so kernel space still works like before, but userspace is locked up.
>  > >  > 
>  > >  > Interesting.  I suspect that if you reverted the rest of this merge
>  > >  > window's RCU patches, you would get the same result.
> 
> Something that occurred to me is that this might be something in the x86 merge
> that's just changing timings enough to expose this problem.
> At some point this evening, I'll try bisecting it if we don't get any closer.

OK.  ;-)

>  > > [ 1953.672735] INFO: Stall ended before state dump start, gp_kthread state: 0x2
>  > > [ 2148.608132] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [ 2148.609140] 	(detected by 0, t=104027 jiffies, g=47728, c=47727, q=0)
>  > > etc etc.
>  > 
>  > Waiting uninterruptibly.  Presumably blocked on mutex_lock().  But
>  > you have CONFIG_PROVE_LOCKING=y, so any deadlocks should have been
>  > reported.
> 
> Lockdep had reported something a little earlier (timestamped at 1108.xxxxxx)
> but that's a known false-positive in xfs.

Yep, I would be very surprised if that was related to the grace-period hang.
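
For reference, the fields of the stall summary line quoted above can be
pulled apart mechanically.  What follows is only a rough sketch: the
regular expression and the helper name parse_stall_line() are
illustrative assumptions, not anything from the kernel tree.  As I read
the stall-warning documentation, "detected by" is the CPU that noticed
the stall, t= is jiffies since the grace period started, g= and c= are
the current and last-completed grace-period numbers, and q= is the
number of queued RCU callbacks.

```python
import re

# Hypothetical pattern matching the summary line of an RCU stall warning.
STALL_RE = re.compile(
    r"\(detected by (?P<cpu>\d+), t=(?P<t>\d+) jiffies, "
    r"g=(?P<g>\d+), c=(?P<c>\d+), q=(?P<q>\d+)\)"
)

def parse_stall_line(line):
    """Return the stall fields as a dict of ints, or None if no match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

# The line from the report in this thread.
line = "[ 2148.609140] \t(detected by 0, t=104027 jiffies, g=47728, c=47727, q=0)"
info = parse_stall_line(line)

# g is one ahead of c: grace period 47728 started but has not completed.
grace_period_in_flight = info["g"] == info["c"] + 1
```

Here g=47728 with c=47727 is consistent with a single grace period that
started and never finished, which is the hang being chased.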

>  > Given that you have CONFIG_RCU_TRACE=y, could you please enable the
>  > following trace events and dump the trace before things hang?
>  > 
>  > 	trace_event=rcu:rcu_grace_period,rcu:rcu_grace_period_init
>  > 
>  > If it is not feasible to dump the trace before things hang, let me
>  > know, and I will work out some other diagnostic regime.
> 
> I'll give that a shot when I get back in a few hours.

Cool!
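
The trace_event= string above is the boot-parameter form.  If rebooting
is inconvenient, the same two events can be switched on at run time by
writing to their "enable" files under the tracing directory.  Below is a
minimal sketch of that, assuming debugfs is mounted in the usual place
(/sys/kernel/debug) and that the script runs as root.  The helper names
enable_trace_events() and read_trace() are made up for illustration.

```python
import os

# The two events requested in this thread.
RCU_EVENTS = ["rcu:rcu_grace_period", "rcu:rcu_grace_period_init"]

# Assumed mount point; adjust if debugfs lives elsewhere on your system.
DEFAULT_TRACING_DIR = "/sys/kernel/debug/tracing"

def enable_trace_events(events, tracing_dir=DEFAULT_TRACING_DIR):
    """Write '1' to events/<subsystem>/<event>/enable for each event."""
    for event in events:
        subsystem, name = event.split(":", 1)
        path = os.path.join(tracing_dir, "events", subsystem, name, "enable")
        with open(path, "w") as f:
            f.write("1")

def read_trace(tracing_dir=DEFAULT_TRACING_DIR):
    """Return the current contents of the trace buffer as text."""
    with open(os.path.join(tracing_dir, "trace")) as f:
        return f.read()
```

Calling enable_trace_events(RCU_EVENTS) early, and then saving the
output of read_trace() periodically to a file, should leave something
to look at even if userspace locks up before the final dump.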

							Thanx, Paul
