linux-kernel - Re: [PATCH] ftrace based hard lockup detector

Message-ID: <20090119132428.GA23299@elte.hu>
Date:	Mon, 19 Jan 2009 14:24:28 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ftrace based hard lockup detector


* Steven Rostedt <rostedt@...dmis.org> wrote:

> 
> On Mon, 19 Jan 2009, Ingo Molnar wrote:
> 
> > 
> > * Steven Rostedt <rostedt@...dmis.org> wrote:
> > 
> > > On Sun, 18 Jan 2009, Frederic Weisbecker wrote:
> > > 
> > > > > Like the NMI watchdog, this feature tries to detect hard lockups by
> > > > > watching for a lack of progress in the timer interrupts.
> > > > 
> > > > You can enable it at boot time by passing the ftrace_hardlockup parameter.
> > > > I plan to add a debugfs file to enable/disable at runtime.
> > > > 
> > > > When a hardlockup is detected, it will print a backtrace. Perhaps it
> > > > would be good to print the locks held from lockdep too?
> > > > 
> > > > > It only supports x86 for the moment, because a generic timer interrupt
> > > > > counter is needed on all archs to make it generic.
> > > > 
> > > > Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
> > > 
> > > Hi Frederic,
> > > 
> > > This seems like a rewrite of the NMI lockup code. In my debugging, I 
> > > simply put ftrace_dump in the NMI lockup, which gives me a ftrace dump 
> > > as soon as NMI detects a lockup. I'm a bit confused at what this gives 
> > > us over that?
> > 
> > this is different from the NMI watchdog in a number of ways:
> > 
> >  - it works on all platforms and in all situations where the NMI watchdog 
> >    does not work.
> > 
> >  - in theory it can detect hard lockups in situations where the NMI 
> >    watchdog is disabled, such as suspend/resume or early bootup. 
> >    (especially early bootup lockups are nasty and the NMI watchdog is 
> >     enabled relatively late)
> > 
> >  - it could be extended to detect 'soft' lockups too - i.e. we could have 
> >    a one-stop facility to detect all kinds of "kernel does not seem to 
> >    progress" lockups.
> > 
> > But it's not as complete as the NMI watchdog: it relies on instrumented 
> > function calls rolling on and on during the lockup - that's not the case 
> > when we get a hard lockup due to a tight, infinite loop somewhere.
> 
> Ah, OK, the check is in the function tracer. Hmm, my logdev code had an 
> option to enable tracing at early bootup. Instead of using the normal 
> memory allocation for the ring buffer, it needed to use alloc_bootmem. I 
> wonder if it would be worth it to allow for a tracer to do the same if 
> it needs to be allocated early on (before memory is initialized)?

Yes, very much so!

Especially when using things like dump_trace options, this would be handy.

	Ingo
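
For illustration only, the detection idea discussed in this thread (a check
that runs from the function tracer and watches whether timer interrupts are
still making progress) can be sketched in plain userspace C. This is not
code from the patch: struct lockup_state, lockup_check() and
LOCKUP_THRESHOLD_NS are hypothetical names, and the 5 second threshold is
an assumption. The caller is assumed to pass in the current timer tick
count and a clock reading in nanoseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: how long the tick counter may stall. */
#define LOCKUP_THRESHOLD_NS 5000000000ULL /* 5 seconds, an assumption */

/* Hypothetical per-CPU bookkeeping; not taken from the patch. */
struct lockup_state {
	uint64_t last_ticks;   /* timer tick count at the last observed progress */
	uint64_t last_seen_ns; /* clock reading when that progress was observed */
	bool reported;         /* avoid reporting the same stall repeatedly */
};

/*
 * Meant to be called from the hook that runs on every traced function.
 * Returns true exactly once per stall, when the tick counter has not
 * moved for longer than LOCKUP_THRESHOLD_NS.
 */
static bool lockup_check(struct lockup_state *st, uint64_t ticks_now,
			 uint64_t now_ns)
{
	assert(st != NULL);

	if (ticks_now != st->last_ticks) {
		/* Timer interrupts are still arriving: record progress. */
		st->last_ticks = ticks_now;
		st->last_seen_ns = now_ns;
		st->reported = false;
		return false;
	}

	if (st->reported || now_ns - st->last_seen_ns <= LOCKUP_THRESHOLD_NS)
		return false;

	st->reported = true; /* the caller would print a backtrace here */
	return true;
}
```

The sketch also shows the limitation mentioned above: lockup_check() only
runs when a traced function is called, so a tight infinite loop that makes
no function calls would never reach it.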