Message-ID: <alpine.LFD.2.00.1103110035370.2787@localhost6.localdomain6>
Date: Fri, 11 Mar 2011 00:46:40 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Phil Carmody <ext-phil.2.carmody@...ia.com>, gregkh@...e.de,
linux-kernel@...r.kernel.org
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump
On Fri, 11 Mar 2011, Thomas Gleixner wrote:
> On Thu, 10 Mar 2011, Andrew Morton wrote:
> > On Fri, 11 Mar 2011 00:13:58 +0100 (CET)
> > Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > > > > It's more of a distraction than anything which is relevant to 99.999%
> > > > > of the problems we have to deal with.
> > > >
> > > > As I indicated before, I've previously thought that too, but thought I
> > > > could 'fix' it by adding to it when I hit the once-in-three-years case.
> > >
> > > The interesting question is:
> > >
> > > How did that info help and was it really the ultimate reason why you
> > > found the underlying bug ?
> >
> > What happens with sysfs is that if a subsystem's handler is buggy, that
> > tends to cause a crash within sysfs core code. You get a stack trace
> > which contains only VFS and sysfs functions - there is no symbol in the
> > trace which permits you to identify the offending subsystem.
>
> Reminds me of timer bugs, which popped up way after the fact that some
> stupid driver reinitialized an active timer or freed memory
> containing an active driver.
Gah: s/active driver/active timer/
> For some obvious reasons I haven't seen any of those bugs wasting my
> time other than asking the bug reporter to enable debugobjects. :)
That said, we really want better debug facilities which are not
cluttering the basic debug output with totally irrelevant information.

Following your reasoning we should record the last accessed file in
general, plus the last ioctl and whatever we think might be relevant
to decode random bugs more easily. That's not going to fly.

The main problems are object lifetime rules or missing function
pointers in the first place. Both can be tackled by other means than
adding random information to the backtrace.

Thanks,

	tglx