Date:	Sun, 21 Dec 2014 19:11:56 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Dave Jones <davej@...emonkey.org.uk>,
	Thomas Gleixner <tglx@...utronix.de>, Chris Mason <clm@...com>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Dâniel Fraga <fragabr@...il.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Suresh Siddha <sbsiddha@...il.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Peter Anvin <hpa@...ux.intel.com>
Subject: Re: frequent lockups in 3.18rc4

On Sun, Dec 21, 2014 at 04:52:28PM -0800, Linus Torvalds wrote:
> On Sun, Dec 21, 2014 at 4:41 PM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > The second time (or third, or fourth - it might not take immediately)
> > you get a lockup or similar. Bad things happen.
> 
> I've only tested it twice now, but the first time I got a weird
> lockup-like thing (things *kind* of worked, but I could imagine that
> one CPU was stuck with a lock held, because things eventually ground
> to a screeching halt).
> 
> The second time I got
> 
>   INFO: rcu_sched self-detected stall on CPU { 5}  (t=84533 jiffies
> g=11971 c=11970 q=17)
> 
> and then
> 
>    INFO: rcu_sched detected stalls on CPUs/tasks: { 1 2 3 4 5 6 7}
> (detected by 0, t=291309 jiffies, g=12031, c=12030, q=57)
> 
> with backtraces that made no sense (because obviously no actual stall
> had taken place), and were the CPU's mostly being idle.

Yep, if time gets messed up too much, RCU can incorrectly decide that
21 seconds have elapsed since the grace period started, and can even
decide this pretty much immediately after the grace period starts.
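
To make the failure mode concrete, here is a minimal user-space sketch of a
tick-based stall check. It is not the kernel's RCU code: HZ is assumed to be
1000, and ticks_after(), struct gp_state, gp_begin() and gp_looks_stalled()
are hypothetical names made up for illustration. The only point is that the
check compares "now" against a deadline computed when the grace period
started, so a forward jump in the time source trips it right away.

```c
#include <stdbool.h>

#define HZ 1000UL                        /* assumed tick rate, illustration only */
#define STALL_TIMEOUT_TICKS (21UL * HZ)  /* the 21 seconds mentioned above */

/* Wraparound-safe "a is later than b", in the style of the kernel's time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical per-grace-period bookkeeping. */
struct gp_state {
	unsigned long gp_start;        /* tick count when the grace period began */
	unsigned long stall_deadline;  /* tick count after which we report a stall */
};

/* Record the start of a grace period and arm the stall deadline. */
static void gp_begin(struct gp_state *gp, unsigned long now)
{
	gp->gp_start = now;
	gp->stall_deadline = now + STALL_TIMEOUT_TICKS;
}

/*
 * Returns true once "now" has passed the deadline. Nothing here can tell
 * whether 21 seconds really elapsed or the tick counter simply jumped.
 */
static bool gp_looks_stalled(const struct gp_state *gp, unsigned long now)
{
	return ticks_after(now, gp->stall_deadline);
}
```

With a sane clock, a grace period begun at tick 1000 is not flagged five
ticks later. If the counter instead leaps forward by a few hundred thousand
ticks between two consecutive checks, gp_looks_stalled() returns true even
though almost no real time has passed.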

							Thanx, Paul

> I could easily see it resulting in your softlockup scenario too.
> 
>                           Linus
> 
