Date:	Thu, 21 May 2015 09:52:28 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Josh Poimboeuf <jpoimboe@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Michal Marek <mmarek@...e.cz>,
	Peter Zijlstra <peterz@...radead.org>, X86 ML <x86@...nel.org>,
	live-patching@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andy Lutomirski <luto@...nel.org>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Brian Gerst <brgerst@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Borislav Petkov <bp@...en8.de>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v4 0/3] Compile-time stack frame pointer validation


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Wed, May 20, 2015 at 9:25 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > On Wed, May 20, 2015 at 09:03:37AM -0700, Andy Lutomirski wrote:
> >>
> >> I've never quite understood what the '?' means.
> >
> > It basically means "here's a function address we found on the 
> > stack, which may or may not have been called."  It's needed 
> > because stack walking isn't currently 100% reliable.
> 
> It is often quite interesting and helpful, because it shows stale 
> data on the stack, giving clues about what happened just before.

Yes, it's basically a zero-cost tracer: often showing a partial trace 
of events that happened before.
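
For readers who have not looked at how such a trace is produced, here is a toy model of the idea. It is only a sketch, not the kernel's actual unwinder: the stack layout, the symbol table and the function names are all made up for illustration. Every stack word that resolves to a symbol is printed, and the ones that are not reached by following the saved frame pointers get the '?' prefix.

```python
def frame_chain_slots(stack, bp):
    """Follow saved frame pointers; return the indices of return-address slots."""
    slots = set()
    while isinstance(bp, int) and 0 <= bp < len(stack) - 1:
        slots.add(bp + 1)              # return address sits right above the saved bp
        nxt = stack[bp]                # saved bp of the caller
        if not isinstance(nxt, int) or nxt <= bp:
            break                      # end of chain, or a corrupt link
        bp = nxt
    return slots


def dump_stack(stack, bp, symbols):
    """Scan every stack word and report the ones that resolve to a symbol."""
    reliable = frame_chain_slots(stack, bp)
    lines = []
    for i, word in enumerate(stack):
        name = symbols.get(word)
        if name is None:
            continue                   # plain data, not a text address
        lines.append(name if i in reliable else "? " + name)
    return lines


# Hypothetical symbol table and stack contents (index 0 is the top of the stack).
SYMBOLS = {0x1000: "do_work", 0x1100: "main_loop",
           0x2000: "old_callee", 0x3000: "helper"}
STACK = [
    0xdead,   # 0: local data
    0x2000,   # 1: stale return address left behind by an earlier call
    5,        # 2: saved frame pointer -> slot 5
    0x1000,   # 3: return address on the frame-pointer chain
    0x3000,   # 4: another stale text address
    None,     # 5: saved frame pointer, end of chain
    0x1100,   # 6: return address on the frame-pointer chain
]

TRACE = dump_stack(STACK, bp=2, symbols=SYMBOLS)
print("\n".join(TRACE))
```

In this toy run the two stale addresses show up with a '?' in front of them, interleaved with the two entries that sit on the frame-pointer chain.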

> Now, I'd like gcc to generally be better about not wasting so much 
> stack frame, so in that sense I'd like to see fewer '?' entries just 
> from a code quality standpoint, but when debugging those things, the 
> downside of "noise" is often cancelled by the upside of "ahh, it 
> happens after calling X".
> 
> So the "perfect stack frames" is actually not as great a thing as 
> some people want to make it seem.

We should definitely also print out the '?' entries: they are very 
useful, especially when analyzing rare, difficult to reproduce, 
sporadic bugs, which are usually the hardest bugs to fix.

The biggest long term plus of 'perfect stack frames' would not be to 
skip the '?' entries (we don't want to skip them!), but to be able to 
eventually build the kernel without frame pointers.

Especially on modern x86 CPUs with stack engines (latest Intel and AMD 
CPUs), which keep RSP updates out of the later stages of the execution 
pipelines, going from RBP framepointers to direct RSP use is 
beneficial to performance and shrinks the I$ footprint as well:

    text       data        bss        dec        hex filename
12150606    2565544    1634304   16350454     f97cf6 linux-CONFIG_FRAME_POINTERS=n/vmlinux
13282884    2571744    1617920   17472548    10a9c24 linux-CONFIG_FRAME_POINTERS=y/vmlinux

Here are the I$ miss stats for the 'vfs-mix' workload that I used 
in the -falign-functions measurements. This is the result for 
CONFIG_FRAMEPOINTERS=y, on Intel Sandy Bridge (best of 9x10 runs):

 #
 # CONFIG_FRAMEPOINTERS=y
 #
 Performance counter stats for 'system wide' (10 runs):

       728,328,347      L1-icache-load-misses                                         ( +-  0.08% )  (100.00%)
    11,891,931,664      instructions                                                  ( +-  0.00% )
           300,023      context-switches                                              ( +-  0.00% )

       7.324048170 seconds time elapsed                                          ( +-  0.09% )

... and these are the I$ miss perf stats from running the same 
workload on a CONFIG_FRAMEPOINTERS=n kernel:

 #
 # CONFIG_FRAMEPOINTERS are not set
 #
 Performance counter stats for 'system wide' (10 runs):

       687,758,078      L1-icache-load-misses                                         ( +-  0.10% )  (100.00%)
    10,984,908,013      instructions                                                  ( +-  0.01% )
           300,021      context-switches                                              ( +-  0.00% )

       7.120867260 seconds time elapsed                                          ( +-  0.29% )

So if we disable frame pointers, then on this workload:

  - the kernel text size is about 8.5% smaller (put the other way, the
    framepointers kernel is 9.3% larger)
  - the number of instructions executed went down by about 7.6%
  - the number of I$ misses went down by about 5.6%
  - performance went up by about 2.8% (best runtime).
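
The percentages can be rechecked from the numbers quoted above. A minimal sketch follows: the values are copied from the size and perf stat output, while the helper names are made up here. It reports each difference against both baselines, because the result depends on which kernel is taken as 100%.

```python
# Numbers copied from the size and perf stat output quoted above.
WITH_FP = {"text": 13282884, "instructions": 11891931664,
           "icache_misses": 728328347, "seconds": 7.324048170}
NO_FP = {"text": 12150606, "instructions": 10984908013,
         "icache_misses": 687758078, "seconds": 7.120867260}


def reduction_pct(with_fp, no_fp):
    """Drop when going from the framepointers kernel to the other one."""
    return 100.0 * (with_fp - no_fp) / with_fp


def overhead_pct(with_fp, no_fp):
    """Extra cost of framepointers, relative to the no-framepointers kernel."""
    return 100.0 * (with_fp - no_fp) / no_fp


for key in WITH_FP:
    a, b = WITH_FP[key], NO_FP[key]
    print(f"{key:15s} reduction {reduction_pct(a, b):5.2f}%   "
          f"overhead {overhead_pct(a, b):5.2f}%")
```

For the "seconds" row the overhead column is the same thing as the speedup: how much faster the no-framepointers kernel completes the workload.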

The speedup is actually even better than 2.8%, if you look at average 
execution time:

linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.324048170 seconds time elapsed                                          ( +-  0.09% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.470166715 seconds time elapsed                                          ( +-  1.01% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.365047474 seconds time elapsed                                          ( +-  0.25% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.828223324 seconds time elapsed                                          ( +-  2.04% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.427164489 seconds time elapsed                                          ( +-  0.70% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.385565350 seconds time elapsed                                          ( +-  0.35% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.560782318 seconds time elapsed                                          ( +-  1.68% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.399741309 seconds time elapsed                                          ( +-  0.74% )
linux-CONFIG_FRAME_POINTERS=y/res.txt:       7.303746766 seconds time elapsed                                          ( +-  0.04% )

 avg = 7.451609

linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.201498813 seconds time elapsed                                          ( +-  0.86% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.120867260 seconds time elapsed                                          ( +-  0.29% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.141642635 seconds time elapsed                                          ( +-  0.15% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.217213506 seconds time elapsed                                          ( +-  0.85% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.163046581 seconds time elapsed                                          ( +-  0.56% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.128939439 seconds time elapsed                                          ( +-  0.23% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.256172853 seconds time elapsed                                          ( +-  0.82% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.122946768 seconds time elapsed                                          ( +-  0.23% )
linux-CONFIG_FRAME_POINTERS=n/res.txt:       7.126018578 seconds time elapsed                                          ( +-  0.18% )

 avg = 7.164260

So going by the averages, this workload gets about 4.0% faster with 
framepointers disabled.

The average result is also pretty stable in the no-framepointers case, 
while it fluctuates more in the framepointers case. (This is why the 
'best runtime' comparison favors the framepointers case; the average 
is closer to reality.)
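
The averages and the run-to-run spread can be reproduced from the nine results listed for each kernel. A small sketch: the lists are copied from the res.txt lines above, and using the sample standard deviation as the measure of spread is a choice made here, not something taken from the measurements.

```python
from statistics import mean, stdev

# Elapsed times (seconds) copied from the res.txt lines above.
FP_Y = [7.324048170, 7.470166715, 7.365047474, 7.828223324, 7.427164489,
        7.385565350, 7.560782318, 7.399741309, 7.303746766]
FP_N = [7.201498813, 7.120867260, 7.141642635, 7.217213506, 7.163046581,
        7.128939439, 7.256172853, 7.122946768, 7.126018578]

avg_y, avg_n = mean(FP_Y), mean(FP_N)

# Speedup of the no-framepointers kernel, based on average elapsed time.
speedup_pct = 100.0 * (avg_y / avg_n - 1.0)

print(f"framepointers=y: avg {avg_y:.6f}s  stdev {stdev(FP_Y):.6f}s")
print(f"framepointers=n: avg {avg_n:.6f}s  stdev {stdev(FP_N):.6f}s")
print(f"speedup from disabling framepointers: {speedup_pct:.2f}%")
```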

So the performance advantage of not doing framepointers is not 
something we can ignore IMHO. But obviously performance isn't 
everything: if stack unwinding is not robust, then we need and
want frame pointers.

Thanks,

	Ingo