Message-ID: <20090119132428.GA23299@elte.hu>
Date: Mon, 19 Jan 2009 14:24:28 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ftrace based hard lockup detector

* Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Mon, 19 Jan 2009, Ingo Molnar wrote:
>
> >
> > * Steven Rostedt <rostedt@...dmis.org> wrote:
> >
> > > On Sun, 18 Jan 2009, Frederic Weisbecker wrote:
> > >
> > > > Like the NMI watchdog, this feature tries to detect hard lockups by
> > > > watching for non-progress of the timer interrupts.
> > > >
> > > > You can enable it at boot time by passing the ftrace_hardlockup parameter.
> > > > I plan to add a debugfs file to enable/disable at runtime.
> > > >
> > > > When a hardlockup is detected, it will print a backtrace. Perhaps it
> > > > would be good to print the locks held from lockdep too?
> > > >
> > > > It only supports x86 for the moment, because making it generic would
> > > > need some kind of generic timer interrupt counter on all archs.
> > > >
> > > > Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
> > >
> > > Hi Frederic,
> > >
> > > This seems like a rewrite of the NMI lockup code. In my debugging, I
> > > simply put ftrace_dump in the NMI lockup, which gives me an ftrace dump
> > > as soon as NMI detects a lockup. I'm a bit confused at what this gives
> > > us over that?
> >
> > this is different from the NMI watchdog in a number of ways:
> >
> > - it works on all platforms and in all situations where the NMI watchdog
> > does not work.
> >
> > - in theory it can detect hard lockups in situations where the NMI
> > watchdog is disabled, such as suspend/resume or early bootup.
> > (especially early bootup lockups are nasty and the NMI watchdog is
> > enabled relatively late)
> >
> > - it could be extended to detect 'soft' lockups too - i.e. we could have
> > a one-stop facility to detect all kinds of "kernel does not seem to
> > progress" lockups.
> >
> > But it's not as complete as the NMI watchdog: it relies on instrumented
> > function calls rolling on and on during the lockup - that's not the case
> > when we get a hard lockup due to a tight, infinite loop somewhere.
>
> Ah, OK, the check is in the function tracer. Hmm, my logdev code had an
> option to enable tracing at early bootup. Instead of using the normal
> memory allocation for the ring buffer, it needed to use alloc_bootmem. I
> wonder if it would be worth it to allow for a tracer to do the same if
> it needs to be allocated early on (before memory is initialized)?

Yes, very much so!

Especially when using things like dump_trace options, this would be handy.

	Ingo
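
For readers following the thread, the check discussed above (noticing from
the function tracer that the timer interrupt count has stopped advancing)
could be sketched in plain userspace C roughly as below. This is not
Frederic's patch: struct lockup_state, lockup_check(), lockup_check_demo()
and the nanosecond threshold are hypothetical names and assumptions picked
for illustration, and the tick count and timestamp are passed in by the
caller instead of being read from real kernel counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; none of these names come from the patch. */
struct lockup_state {
	uint64_t last_ticks;	/* timer interrupt count last observed */
	uint64_t last_change;	/* time (ns) at which last_ticks last advanced */
	uint64_t threshold_ns;	/* stall length treated as a hard lockup */
	bool reported;		/* report each stall only once */
};

/*
 * Meant to be called from every traced function entry.
 * Returns true the first time the tick count has not advanced
 * for at least threshold_ns.
 */
static bool lockup_check(struct lockup_state *s, uint64_t ticks_now,
			 uint64_t now_ns)
{
	if (ticks_now != s->last_ticks) {
		/* Timer interrupts are still arriving: restart the window. */
		s->last_ticks = ticks_now;
		s->last_change = now_ns;
		s->reported = false;
		return false;
	}

	if (s->reported || now_ns - s->last_change < s->threshold_ns)
		return false;

	s->reported = true;	/* a real detector would print a backtrace here */
	return true;
}

/* Small walk-through with made-up tick counts and timestamps. */
static void lockup_check_demo(void)
{
	struct lockup_state s = { .threshold_ns = 1000 };

	assert(!lockup_check(&s, 1, 100));	/* tick advanced */
	assert(!lockup_check(&s, 1, 600));	/* stalled, below threshold */
	assert(lockup_check(&s, 1, 1100));	/* stalled for 1000 ns: detected */
	assert(!lockup_check(&s, 1, 2000));	/* already reported */
	assert(!lockup_check(&s, 2, 2100));	/* ticks resumed, state reset */
}
```

Note that such a check only runs while instrumented functions keep being
called, which is the limitation mentioned above: a tight loop that makes no
traced calls would never reach it.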