Date: Fri, 12 Dec 2014 07:54:11 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Sasha Levin <sasha.levin@...cle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Jones <davej@...hat.com>, Chris Mason <clm@...com>,
Mike Galbraith <umgwanakikbuti@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Dâniel Fraga <fragabr@...il.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: frequent lockups in 3.18rc4
* Sasha Levin <sasha.levin@...cle.com> wrote:
> Right, and it reproduces in 3.10 as well, so it's not really a
> new thing.
>
> What's odd is that I don't remember seeing this bug so long in
> the past, I'll try bisecting trinity rather than the kernel -
> it's the only other thing that changed.
I think DaveJ mentioned that Trinity recently changed its test
task count and now loads the system more aggressively. Such a
change might have made a dormant bug related to resource limits,
or a load-dependent race, more likely to trigger.
I think at this point it would also be useful to debug the hang
itself directly: using triggered printks and kgdb and drilling
into all the data structures to figure out why the system isn't
progressing.
If the bug triggers in a VM (which your testing uses), the failed
kernel state ought to be a lot more accessible than on bare metal.
That it triggers in a VM, if it's the same bug as DaveJ's, also
makes the hardware bug theory a lot less likely.
Thanks,
Ingo