Date:	Thu, 4 Dec 2014 15:23:30 +0100
From:	Jiri Olsa <jolsa@...hat.com>
To:	kan.liang@...el.com
Cc:	acme@...nel.org, a.p.zijlstra@...llo.nl, eranian@...gle.com,
	linux-kernel@...r.kernel.org, mingo@...hat.com, paulus@...ba.org,
	ak@...ux.intel.com, namhyung@...nel.org
Subject: Re: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

On Tue, Dec 02, 2014 at 10:06:51AM -0500, kan.liang@...el.com wrote:
> From: Kan Liang <kan.liang@...el.com>
> 
> This is the user space patch for Haswell LBR call stack support.
> For many profiling tasks we need the callgraph. For example we often
> need to see the caller of a lock or the caller of a memcpy or other
> library function to actually tune the program. Frame pointer unwinding
> is efficient and works well. But frame pointers are off by default on
> 64bit code (and on modern 32bit gccs), so there are many binaries around
> that do not use frame pointers. Profiling unchanged production code is
> very useful in practice. On some CPUs frame pointers also have a high
> cost. DWARF2 unwinding does not always work either and is extremely slow
> (up to 20% overhead).
> 
> Haswell has a new feature that utilizes the existing Last Branch Record
> facility to record call chains. When the feature is enabled, function
> calls are collected as normal, but as return instructions are
> executed the last captured branch record is popped from the on-chip LBR
> registers. The LBR call stack facility provides an alternative to get
> callgraph. It has some limitations too, but should work in most cases
> and is significantly faster than dwarf. Frame pointer unwinding is still
> the best default, but LBR call stack is a good alternative when nothing
> else works.
> 
> Please find the kernel part patch at https://lkml.org/lkml/2014/11/6/432
> 
> Changes since v1
>  - Update help document
>  - Force exclude_user to 0 with warning in LBR call stack
>  - Dump both lbr and fp info when report -D
>  - Reconstruct thread__resolve_callchain_sample and split it into two patches
>  - Use has_branch_callstack function to check whether LBR call stack is available
> 
> Changes since v2
>  - Rebase to 025ce5d33373
> 
> Changes since v3
>  - Rebase to cc502c23aadf
>  - Separated function for lbr call stack sample resolve and print
>  - Some minor changes according to comments
> 
> Changes since V4
>  - Rebase to 09a6a1b
>  - Fall back to frame pointers if LBR is not available, and warn the user
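
The push-on-call, pop-on-return behaviour described in the cover letter can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the hardware behaviour or the perf implementation: the class name LbrCallStack, its methods, and the default depth of 16 entries are all hypothetical choices made for illustration, and overflow is modelled simply as losing the oldest entry.

```python
from collections import deque


class LbrCallStack:
    """Toy model of LBR call-stack mode (hypothetical, for illustration only)."""

    def __init__(self, depth=16):
        # A bounded deque drops the oldest entry when full, which is how this
        # sketch models deep call chains losing their outermost callers.
        self.entries = deque(maxlen=depth)

    def call(self, caller, callee):
        # A call instruction records a branch (from caller, to callee).
        self.entries.append((caller, callee))

    def ret(self):
        # A return instruction pops the last captured branch record.
        if self.entries:
            self.entries.pop()

    def callchain(self):
        # What a sample would see: the live call chain, innermost callee first.
        return [callee for _, callee in reversed(self.entries)]


lbr = LbrCallStack(depth=4)
lbr.call("main", "parse")
lbr.ret()                      # parse() returned, its record is gone
lbr.call("main", "run")
lbr.call("run", "memcpy")
print(lbr.callchain())         # ['memcpy', 'run']
```

Because returned-from functions are popped, the remaining records describe only the currently active frames, which is why a sample can be turned into a call graph without frame pointers or DWARF unwinding. The bounded depth is also one source of the limitations mentioned above.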

looks ok to me..

I'll test it once I get my hands on a Haswell server again. I guess we
wait for the kernel change to go in first anyway, right?

thanks,
jirka