[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.11.1411182023360.3909@nanos>
Date: Tue, 18 Nov 2014 20:28:01 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Dave Jones <davej@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Don Zickus <dzickus@...hat.com>
Subject: Re: frequent lockups in 3.18rc4
On Tue, 18 Nov 2014, Linus Torvalds wrote:
> On Tue, Nov 18, 2014 at 6:52 AM, Dave Jones <davej@...hat.com> wrote:
> >
> > Here's the first hit. Curiously, one cpu is missing.
>
> That might be the CPU3 that isn't responding to IPIs due to some bug..
>
> > NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [trinity-c180:17837]
> > RIP: 0010:[<ffffffffa91a0db0>] [<ffffffffa91a0db0>] bad_range+0x0/0x90
>
> Hmm. Something looping in the page allocator? Not waiting for a lock,
> but livelocked? I'm not seeing anything here that should trigger the
> NMI watchdog at all.
>
> Can the NMI watchdog get confused somehow?
That's the soft lockup detector which runs from the timer interrupt
not from NMI.
> So it does look like CPU3 is the problem, but sadly, CPU3 is
> apparently not listening, and doesn't even react to the NMI, much less
As I said in the other mail. It gets the NMI and reacts on it. It's
just mangled into the CPU0 backtrace.
Thanks,
tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists