Message-ID: <37D7C6CF3E00A74B8858931C1DB2F0770167D124@shsmsx102.ccr.corp.intel.com>
Date: Thu, 4 Dec 2014 14:49:52 +0000
From: "Liang, Kan" <kan.liang@...el.com>
To: Jiri Olsa <jolsa@...hat.com>,
"a.p.zijlstra@...llo.nl" <a.p.zijlstra@...llo.nl>
CC: "acme@...nel.org" <acme@...nel.org>,
"a.p.zijlstra@...llo.nl" <a.p.zijlstra@...llo.nl>,
"eranian@...gle.com" <eranian@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"paulus@...ba.org" <paulus@...ba.org>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"namhyung@...nel.org" <namhyung@...nel.org>
Subject: RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)
> On Tue, Dec 02, 2014 at 10:06:51AM -0500, kan.liang@...el.com wrote:
> > From: Kan Liang <kan.liang@...el.com>
> >
> > This is the user space patch for Haswell LBR call stack support.
> > For many profiling tasks we need the callgraph. For example we often
> > need to see the caller of a lock or the caller of a memcpy or other
> > library function to actually tune the program. Frame pointer unwinding
> > is efficient and works well. But frame pointers are off by default on
> > 64bit code (and on modern 32bit gccs), so there are many binaries
> > around that do not use frame pointers. Profiling unchanged production
> > code is very useful in practice. On some CPUs frame pointers also have a
> > high cost. Dwarf2 unwinding also does not always work and is extremely
> > slow (up to 20% overhead).
> >
> > Haswell has a new feature that utilizes the existing Last Branch
> > Record facility to record call chains. When the feature is enabled,
> > function calls will be collected as normal, but as return instructions
> > are executed the last captured branch record is popped from the
> > on-chip LBR registers. The LBR call stack facility provides an
> > alternative way to get the callgraph. It has some limitations too, but should
> > work in most cases and is significantly faster than dwarf. Frame
> > pointer unwinding is still the best default, but LBR call stack is a
> > good alternative when nothing else works.
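
To illustrate the push-on-call, pop-on-return behaviour described above, here is a toy model in Python. It is only a sketch, not the hardware logic or the perf code: the class name LbrCallStack, its methods, and the default depth of 16 entries are assumptions made for illustration, as is the choice to drop the oldest record when the buffer is full.

```python
from collections import deque


class LbrCallStack:
    """Toy model of LBR call-stack mode (hypothetical, for illustration only)."""

    def __init__(self, depth=16):
        # Assumed number of LBR entries. A bounded deque silently drops
        # the oldest record once it is full.
        self.records = deque(maxlen=depth)

    def call(self, from_addr, to_addr):
        # A call instruction records a (caller, callee) branch entry.
        self.records.append((from_addr, to_addr))

    def ret(self):
        # A return instruction pops the last captured branch record.
        if self.records:
            self.records.pop()

    def callchain(self):
        # Callee of each live record, innermost frame first.
        return [to_addr for _, to_addr in reversed(self.records)]


lbr = LbrCallStack(depth=4)
lbr.call("main", "parse")
lbr.ret()                      # parse returned, so its record is gone
lbr.call("main", "run")
lbr.call("run", "memcpy")
print(lbr.callchain())         # ['memcpy', 'run']
```

Because returns remove their matching call records, the remaining entries at sample time describe only the active frames. The bounded depth is one of the limitations mentioned above: calls deeper than the buffer lose their outermost callers.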
> >
> > Please find the kernel part patch at
> > https://lkml.org/lkml/2014/11/6/432
> >
> > Changes since v1
> > - Update help document
> > - Force exclude_user to 0 with warning in LBR call stack
> > - Dump both lbr and fp info when report -D
> > - Reconstruct thread__resolve_callchain_sample and split it into two
> > patches
> > - Use has_branch_callstack function to check LBR call stack available
> >
> > Changes since v2
> > - Rebase to 025ce5d33373
> >
> > Changes since v3
> > - Rebase to cc502c23aadf
> > - Separated function for lbr call stack sample resolve and print
> > - Some minor changes according to comments
> >
> > Changes since V4
> > - Rebase to 09a6a1b
> > - Fall back to frame pointers if LBR is not available, and warn the
> > user
>
> looks ok to me..
>
Thanks for the review.
> I'll test it once I get my hands on a Haswell server again. I guess we wait for the
> kernel change to go in first anyway, right?
>
I'm not sure, let's ask Peter.
Peter?
Thanks,
Kan
> thanks,
> jirka