Message-Id: <20080905104952.5e9ea394.akpm@linux-foundation.org>
Date: Fri, 5 Sep 2008 10:49:52 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: Thomas Gleixner <tglx@...utronix.de>,
torvalds@...ux-foundation.org, sfr@...b.auug.org.au,
linux-next@...r.kernel.org, linux-kernel@...r.kernel.org,
yhlu.kernel@...il.com, ink@...assic.park.msu.ru,
jbarnes@...tuousgeek.org, netdev@...r.kernel.org,
viro@...iv.linux.org.uk, ebiederm@...ssion.com,
dwmw2@...radead.org, sam@...nborg.org, johnstul@...ibm.com
Subject: Re: linux-next: Tree for September 3

On Fri, 5 Sep 2008 13:04:11 +0200 Ingo Molnar <mingo@...e.hu> wrote:
>
> * Thomas Gleixner <tglx@...utronix.de> wrote:
>
> > On Thu, 4 Sep 2008, Andrew Morton wrote:
> > > >
> > > > Cute, NULL pointer in the timer check code. Can you please addr2line
> > > > the exact code line or upload the vmlinux somewhere ?
> > > >
> > >
> > > erm, I might have lost that binary, and it only happened the once. It
> > > happened shortly after the machine had fully booted, during
> > > establishment of the first sshd session.
> > >
> > > It nuked the machine really well, too. I had to pull the battery to
> > > get it back.
> >
> > Known problem on Sonys. :(
> >
> > > fwiw:
> > >
> > > (gdb) l *0xc0126e7f
> > > 0xc0126e7f is in get_next_timer_interrupt (kernel/timer.c:863).
> > > warning: Source file is more recent than executable.
> > > 858      for (array = 0; array < 4; array++) {
> > > 859          struct tvec *varp = varray[array];
> > > 860
> > > 861          index = slot = timer_jiffies & TVN_MASK;
> > > 862          do {
> > > 863              list_for_each_entry(nte, varp->vec + slot, entry) {
> > > 864                  found = 1;
> > > 865                  if (time_before(nte->expires, expires))
> > > 866                      expires = nte->expires;
> > > 867              }
> > >
> > > which looks reasonable.
> >
> > Yeah, as Linus decoded it's that loop. So we look at some corrupted
> > entry here.
> >
> > CONFIG_DEBUG_OBJECTS (add debug_objects to the command line as well)
> > should catch it when this is a timer being discarded, freed or
> > reinitialized.
> >
> > Otherwise, when it is just random corruption, it won't help much.
>
> i guess CONFIG_DEBUG_OBJECTS_TIMERS=y is practical, and
> CONFIG_DEBUG_LIST=y would be nice as well - it can catch memory
> corruptions rather early and is relatively light-weight.

I tested rc5-mm1 with all debug options except PAGEALLOC. No help.

> [ and if there's any reproducibility of the corruption and if it happens
> at a stable kernel address then a small custom hack in ftrace can
> catch it the moment it happens. ]

It was a once-off.