[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20150326092348.GA14706@gmail.com>
Date: Thu, 26 Mar 2015 10:23:48 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Jiri Kosina <jkosina@...e.cz>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Michal Marek <mmarek@...e.cz>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
live-patching@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/2] Compile-time stack frame pointer validation
* Jiri Kosina <jkosina@...e.cz> wrote:
> On Wed, 25 Mar 2015, Josh Poimboeuf wrote:
>
> > In discussions around my live kernel patching consistency model RFC [1],
> > Peter and Ingo correctly pointed out that stack traces aren't reliable.
> > And as Ingo said, there's no "strong force" which ensures we can rely on
> > them.
> >
> > So I've been thinking about how to fix that. My goal is to eventually
> > make stack traces reliable. Or at the very least, to be able to detect
> > at runtime when a given stack trace *might* be unreliable. But improved
> > stack traces would broadly benefit the entire kernel, regardless of the
> > outcome of the live kernel patching consistency model discussions.
> [ ... snip ... ]
>
> I haven't really gone through your patchset thoroughly yet, but I
> just wanted to make sure that you are aware of existing DWARF-based
> stack unwinder which exists for the kernel.
>
> It's not merged in mainline (one of the reasons being disagreements
> about bugfixes between Jan and Linus), [...]
That's putting it really mildly ...
The thing is, there's no chance in hell we can take any dwarf
annotation based stack dump patches to mainline without first changing
the fundamental dynamics of debug info quality: and I doubt we can do
that, given that we suffer the dwarf debug info created by external
tools which don't care much about occasionally incorrect debug info
because nothing depends on it functionally.
The beauty of framepointers is:
- framepointers they are simple
- and if tooling breaks them, then actual _C code_ breaks: arguments
on the stack are not accessible anymore and code crashes. That's a
heck of a strong incentive not to mess up framepointers...
The big problem with Dwarf stack annotations on the other hand is that
they are the 180 degress opposite of framepointers:
- dwarf annotations are complex
- if tooling breaks them, then no code breaks, nothing but debug
output breaks. C code still works just fine. This is a heck of a
weak incentive.
Having said all that, I like this particular patch-set because it IMHO
shows the right attitude towards debug info: it does not trust it and
does exactly the kind of checks that I think will improve stack trace
quality in the long run, even with the very simple model of
framepointers - for example inside assembly code.
> [...] but we've been carrying it in SUSE kernels as an out-of-tree
> patch for quite some time, and it really makes stack dumps much more
> reliable and understandable.
We have experience with dwarf debug info quality in tools/perf/ as
well, and "reliable" is not the word I'd use.
Here's a very quick grep of dwarf debug info related fixes in
tools/perf/:
4093325f8297 perf probe: Fix crash in dwarf_getcfi_elf
9b118acae310 perf probe: Fix to handle aliased symbols in glibc
b93b0967826a perf test: Fix dwarf unwind using libunwind.
c6e5e9fbc3ea perf tools: Fix building error in x86_64 when dwarf unwind is on
082f96a93eb5 perf probe: Fix perf probe to find correct variable DIE
9a126728165e perf tests x86: Fix stack map lookup in dwarf unwind test
b42dc32d4f91 perf tools: Fix dwarf unwind max_stack processing
e08cfd4bda76 perf probe: Fix to find line information for probe list
0dbb1cac1dbd perf probe: Fix finder to find lines of given function
a128405c6b40 perf probe: Fix line walker to check CU correctly
66a301c380d4 perf probe: Fix format specified for Dwarf_Off parameter
4046b8bb5ffa perf probe: Fix type searching
75ec5a245c77 perf probe: Fix to close dwarf when failing to analyze it
fc6ceea04503 perf probe: Fix need_dwarf flag if lazy matching is used
(I'm sure there are both false positives and false negatives in that
list)
and dwarf unwind isn't even the default for perf, you have to specify
a weird option to make use of it so most users don't even know it's
there.
> You can see it for example here:
>
> http://kernel.suse.com/cgit/kernel-source/tree/patches.suse/stack-unwind
>
> (and some merge attempt failures due to disagreements between Jan
> and Linus, not really completely related to the actual code itself,
> in LKML archives).
That again is putting it way too mildly IMHO ...
Summary: framepointers are fine, but dwarf based backtraces were
problematic in the past and we have trouble trusting them.
Thanks,
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists