Message-ID: <20120210201850.GA26892@m.redhat.com>
Date: Fri, 10 Feb 2012 21:18:50 +0100
From: Jiri Olsa <jolsa@...hat.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Arnaldo Carvalho de Melo <acme@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>, paulus@...ba.org,
cjashfor@...ux.vnet.ibm.com, fweisbec@...il.com,
linux-kernel@...r.kernel.org,
"James E.J. Bottomley" <jejb@...isc-linux.org>,
Jan Blunck <jblunck@...e.de>
Subject: Re: [RFC 0/5] kernel: backtrace unwind support
On Fri, Feb 10, 2012 at 08:44:26PM +0100, Ingo Molnar wrote:
>
> * Arnaldo Carvalho de Melo <acme@...hat.com> wrote:
>
> > Em Fri, Feb 10, 2012 at 10:59:51AM -0800, Linus Torvalds escreveu:
> > > On Fri, Feb 10, 2012 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > > >
> > > > So I CC'ed Linus who has a strong opinion here, jejb since he's the one that
> > > > told me several times there's a number of literate dwarfs already in the
> > > > kernel and Jan because I think it was him that tried last on x86.
> > >
> > > I never *ever* want to see this code ever again.
> > >
> > > Sorry, but last time was too f*cking painful. The whole (and *only*)
> > > point of unwinders is to make debugging easy when a bug occurs. But
> > > the f*cking dwarf unwinder had bugs itself, or our dwarf information
> > > had bugs, and in either case it actually turned several "trivial" bugs
> > > into a total undebuggable hell.
> > >
> > > It was made doubly painful by the developers involved then several
> > > times ignoring the problem, and claiming the code was bug-free when it
> > > clearly wasn't, or trying to claim that the problem was that we set up
> > > some random dwarf information wrong, when THAT GOES WITHOUT SAYING
> > > (since dwarf is a complex mess that never gets any actual testing
> > > except when things go wrong - at which point the code had better work
> > > regardless of whether the dwarf info was correct or not).
> > >
> > > So no. An unwinder that is several hundred lines long is simply not
> > > even *remotely* interesting to me.
> > >
> > > If you can mathematically prove that the unwinder is correct - even in
> > > the presence of bogus and actively incorrect unwinding information -
> > > and never ever follows a bad pointer, I'll reconsider.
> > >
> > > In the absence of that, just follow the damn chain on the stack
> > > *without* the "smarts" of an inevitably buggy piece of crap.
> >
> > "Vote for --fno-omit-frame-pointer! One register is a cheap
> > price to pay for not going insane!"
> >
> > /me goes back to non political things.
>
> Well, instead of dropping it we could try to meet Linus's
> challenge, at least to a fair degree.
>
> Also let's fundamentally treat GCC-provided data as untrusted,
> hostile data and let's put lockdep-alike redundancy and resilience
> around it.
>
> As a first step let's try input randomization unit tests. A lot
> of the broken unwind code was really just sloppy about boundary
> conditions.
right, that looks like the crucial part.. :)
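
A minimal sketch of what such a randomized boundary test could look like,
under stated assumptions: the "stack" is modeled as a plain array of words,
a frame at index fp keeps the caller's frame index in stack[fp] and the
return address in stack[fp + 1], and every name below (walk_frames,
checked_read, fuzz_walk, STACK_WORDS, MAX_FRAMES) is hypothetical, not
existing kernel code. The walker is fed random garbage and the test only
checks that it never reads outside the array and always terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_WORDS 64	/* size of the simulated stack (assumption) */
#define MAX_FRAMES  16	/* hard cap on reported frames (assumption) */

static size_t reads_out_of_bounds;	/* any access outside the array */

static uintptr_t checked_read(const uintptr_t *stack, size_t idx)
{
	if (idx >= STACK_WORDS) {
		reads_out_of_bounds++;
		return 0;
	}
	return stack[idx];
}

/* Follow the frame chain, stopping on anything suspicious.
 * Returns the number of return addresses stored in out[]. */
static size_t walk_frames(const uintptr_t *stack, size_t fp,
			  uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (n < max) {
		uintptr_t next;

		/* Both words of the frame must lie inside the stack. */
		if (fp >= STACK_WORDS - 1)
			break;
		out[n++] = checked_read(stack, fp + 1);
		next = checked_read(stack, fp);
		/* The caller's frame must be strictly higher; this
		 * also guarantees that the loop terminates. */
		if (next <= fp)
			break;
		fp = (size_t)next;
	}
	return n;
}

/* Fill the stack with random words, start from a random frame index
 * and check the invariants. Returns 0 if they held in every round. */
static int fuzz_walk(unsigned int seed, int rounds)
{
	uintptr_t stack[STACK_WORDS], out[MAX_FRAMES];

	srand(seed);
	reads_out_of_bounds = 0;
	for (int r = 0; r < rounds; r++) {
		size_t n;

		for (size_t i = 0; i < STACK_WORDS; i++)
			stack[i] = (uintptr_t)rand() % (2 * STACK_WORDS);
		n = walk_frames(stack, (size_t)rand() % (2 * STACK_WORDS),
				out, MAX_FRAMES);
		if (n > MAX_FRAMES || reads_out_of_bounds)
			return -1;
	}
	return 0;
}
```

Calling fuzz_walk(seed, rounds) with a few different seeds exercises the
boundary checks; a real test would of course have to drive the actual
unwinder with randomized unwind data rather than this toy model.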
>
> I had a quick peek and I don't think it's constructed in a
> resilient enough form right now. For example there's no clear
> separation and checking of what comes from GCC and what not.
yes, there's nothing like this in there now,
I'll see what can be done about that..
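
One possible shape for that separation, as a rough sketch only: all
compiler-provided bytes are reachable solely through a bounds-checked
cursor, so no parsing code ever dereferences a raw pointer into the unwind
data. The struct ubuf type and the ubuf_u8/ubuf_uleb128 helpers are
hypothetical names made up for illustration; ULEB128 is used as the example
because DWARF encodes many of its values that way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Untrusted bytes are only ever touched through this cursor. */
struct ubuf {
	const uint8_t *pos;
	const uint8_t *end;
};

/* Read one byte; returns 0 on success, -1 if the buffer is exhausted. */
static int ubuf_u8(struct ubuf *b, uint8_t *val)
{
	if (b->pos >= b->end)
		return -1;
	*val = *b->pos++;
	return 0;
}

/* Decode one ULEB128 value. Truncated input and values that do not
 * fit into 64 bits are rejected instead of being silently accepted. */
static int ubuf_uleb128(struct ubuf *b, uint64_t *val)
{
	uint64_t result = 0;
	unsigned int shift = 0;
	uint8_t byte;

	do {
		if (ubuf_u8(b, &byte))
			return -1;		/* ran off the end */
		if (shift >= 64)
			return -1;		/* too many bytes */
		if (shift == 63 && (byte & 0x7e))
			return -1;		/* bits beyond 64 */
		result |= (uint64_t)(byte & 0x7f) << shift;
		shift += 7;
	} while (byte & 0x80);

	*val = result;
	return 0;
}
```

With this kind of interface a caller has to handle the error return before
it can use the decoded value, which makes it easier to see what came from
GCC and what did not.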
>
> It *can* be done: lockdep is not hundreds but thousands of lines
> of highly complex code (with non-trivial algorithms such as
> graph walks), and still it has a very good track record - so
> it's possible.
>
> Once that is done I'd like to try it myself in practice, without
> offering it as a pull to Linus. I see a *lot* of weird oopses
> all day in and out, often in impossible contexts, and the old
> dwarf unwinder was crap.
>
> I'd also love to see perf callchains work on all kernels and
> extend into user-space as well, if that's possible in a sane
> fashion. 90% of the interesting apps out there are built with
> framepointers off, and the context of overhead is often rather
> obscure. Looking at good callchains is a good learning
> experience all around.
>
> So it's not *entirely* crazy IMO, let's iterate this please.
> Jiri, are you still interested in it?
yep, looks interesting.. not sure about the mathematical proof though ;)
jirka