Message-ID: <1417806247.4845.1@mail.thefacebook.com>
Date: Fri, 5 Dec 2014 14:04:07 -0500
From: Chris Mason <clm@...com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Dave Jones <davej@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mike Galbraith <umgwanakikbuti@...il.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Dâniel Fraga <fragabr@...il.com>,
Sasha Levin <sasha.levin@...cle.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: frequent lockups in 3.18rc4
On Fri, Dec 5, 2014 at 1:38 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Fri, Dec 5, 2014 at 9:15 AM, Dave Jones <davej@...hat.com> wrote:
>>
>> A bisect later, and I landed on a kernel that ran for a day, before
>> spewing NMI messages, recovering, and then..
>>
>>
>> http://codemonkey.org.uk/junk/log.txt
>
> I have to admit I'm seeing absolutely nothing sensible in there.
>
> Call it bad, and see if bisection ends up slowly -oh so slowly -
> pointing to some direction. Because I don't think it's the hardware,
> considering that apparently 3.16 is solid. And the spews themselves
> are so incomprehensible that I'm not seeing any pattern what-so-ever.

I went back through all of the traces Dave has posted in this thread.
This one looks like it has VM debugging turned on:

http://marc.info/?l=linux-kernel&m=141632237304726&w=2

Another had a function call from CONFIG_DEBUG_PAGEALLOC:

http://marc.info/?l=linux-kernel&m=141701248210949&w=2

So one idea is that, with those debug options enabled, our
allocation/freeing of pages is dramatically more expensive and we're
hitting a strange edge condition. Maybe we're even faulting on a
read-only page from a horrible place?

[83246.925234] end_request: I/O error, dev sda, sector 0

Ext3/4 shouldn't be doing I/O to sector zero. Is something stomping on
RAM?
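
To check whether the sector-zero error shows up elsewhere in a saved log, a
minimal sketch along these lines might help. It assumes the log lines look
exactly like the one quoted above; the function name find_io_errors and the
regular expression are illustrative, not anything from the kernel tree.

```python
import re

# Matches the block-layer message format quoted above, e.g.
# "[83246.925234] end_request: I/O error, dev sda, sector 0"
# (assumption: other kernels may word this message differently).
IO_ERROR_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+end_request: I/O error, "
    r"dev (?P<dev>\S+), sector (?P<sector>\d+)\s*$"
)


def find_io_errors(lines, sector=0):
    """Return (timestamp, device) pairs for I/O errors at the given sector."""
    hits = []
    for line in lines:
        m = IO_ERROR_RE.match(line)
        if m and int(m.group("sector")) == sector:
            hits.append((float(m.group("ts")), m.group("dev")))
    return hits
```

Passing an open log file (for example open("log.txt")) as lines would list
every timestamp and device that reported an error at sector 0.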
-chris