Message-ID: <20170713044038.GE3044@two.firstfloor.org>
Date: Wed, 12 Jul 2017 21:40:38 -0700
From: Andi Kleen <andi@...stfloor.org>
To: Mike Galbraith <efault@....de>
Cc: Andi Kleen <andi@...stfloor.org>,
Josh Poimboeuf <jpoimboe@...hat.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, live-patching@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>, Jiri Slaby <jslaby@...e.cz>,
Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)

On Thu, Jul 13, 2017 at 06:28:43AM +0200, Mike Galbraith wrote:
> On Wed, 2017-07-12 at 21:15 -0700, Andi Kleen wrote:
> > On Thu, Jul 13, 2017 at 05:03:00AM +0200, Mike Galbraith wrote:
> > > On Wed, 2017-07-12 at 15:30 -0700, Andi Kleen wrote:
> > > > Josh Poimboeuf <jpoimboe@...hat.com> writes:
> > > > >
> > > > > The ORC data format does have a few downsides compared to DWARF. The
> > > > > ORC unwind tables take up ~1MB more memory than DWARF eh_frame tables.
> > > > >
> > > > Can we have an option to just use dwarf instead? For people
> > > > who don't want to waste a MB+ to solve a problem that doesn't
> > > > exist (as proven by many years of opensuse kernel experience)
> > >
> > > Sure the dwarf unwinder works well for crashes, but at the price of
> > > demolishing ftrace/perf utility.
> >
> > You mean the unwind performance?
>
> Yeah, it hurts.. massively, has even been known to kill big boxen.
Why was that?
>
> > That's a valid concern, but neither ORC nor dwarf is likely
> > to address it. However, most uses of ftrace/perf shouldn't depend
> > that much on unwind performance -- just lower the frequency of your
> > events.
> >
> > The only possible win is if the win from not using FP code is
> > significant enough. On the x86 side the only modern CPUs that should really
> > care about this are Atoms.
>
> Nope, they all care. Measure performance delta of fast/light stuff.
Well, if your test cares that much about function overhead, you may want to try
LTO. It can get rid of a lot of functions by doing cross-file
inlining.
https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=lto-411-2
> Maybe I'm expecting too much good stuff to follow, but don't spoil it
> for me, I think I'm looking at a real winner :)
It's somewhat surprising. It would be good to understand why that
happens. Is it icache misses, data cache misses for the stack,
simply more instructions executed, or worse tail calls?
-Andi