linux-kernel - Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH


Message-ID: <20151013134004.GA8843@gmail.com>
Date:	Tue, 13 Oct 2015 15:40:04 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	linux-kernel@...r.kernel.org, acme@...hat.com,
	peterz@...radead.org, mingo@...e.hu, ak@...ux.intel.com,
	jolsa@...hat.com, namhyung@...nel.org, khandual@...ux.vnet.ibm.com
Subject: Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL


* Stephane Eranian <eranian@...gle.com> wrote:

> This patch enables support for PERF_SAMPLE_BRANCH_CALL
> on Intel x86 processors. When the processor supports LBR filtering,
> the selection is done in hardware. Otherwise, the filter is
> applied in software. Note that we chose to include zero-length calls
> because they also represent calls.
> 
> Signed-off-by: Stephane Eranian <eranian@...gle.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index ad0b8b0..bfd0b71 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
>  	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>  		mask |= X86_BR_IND_JMP;
>  
> +	if (br_type & PERF_SAMPLE_BRANCH_CALL)
> +		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;

I'm wondering how frequent zero-length calls are. If they still occur in typical 
user-space, would it make sense to also have a separate branch sampling type for 
zero-length calls?

Intel documents zero-length calls as ones that (ab-)use the call instruction to 
push the current IP on the stack:

	call next_addr
next_addr:
	pop %reg

which can take over 10 cycles on certain microarchitectures (and it unbalances 
whatever call stack tracking/caching the CPU does as well).

So it might make sense to analyze them separately. I guess that's the reason why 
Intel added a separate flag for them in the PMU.
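
To make the suggestion concrete, here is a rough sketch of what a split software 
filter mapping could look like. It is only an illustration under stated 
assumptions: every SKETCH_* name and bit value below is a stand-in invented for 
this example, not the kernel's definition, and SKETCH_SAMPLE_BRANCH_ZERO_CALL in 
particular is a hypothetical sampling type that does not exist in the perf ABI:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for the user-visible branch sample types.
 * Names and bit positions are assumptions made for this sketch only.
 */
#define SKETCH_SAMPLE_BRANCH_IND_JUMP	(1ULL << 0)
#define SKETCH_SAMPLE_BRANCH_CALL	(1ULL << 1)
#define SKETCH_SAMPLE_BRANCH_ZERO_CALL	(1ULL << 2)	/* hypothetical type */

/* Illustrative stand-ins for the internal software filter classes. */
#define SKETCH_BR_IND_JMP	(1 << 0)
#define SKETCH_BR_CALL		(1 << 1)
#define SKETCH_BR_ZERO_CALL	(1 << 2)

/* Behaviour as in the patch: CALL also selects zero-length calls. */
int sketch_mask_combined(uint64_t br_type)
{
	int mask = 0;

	if (br_type & SKETCH_SAMPLE_BRANCH_IND_JUMP)
		mask |= SKETCH_BR_IND_JMP;

	if (br_type & SKETCH_SAMPLE_BRANCH_CALL)
		mask |= SKETCH_BR_CALL | SKETCH_BR_ZERO_CALL;

	return mask;
}

/* Split variant: zero-length calls get their own (hypothetical) type. */
int sketch_mask_split(uint64_t br_type)
{
	int mask = 0;

	if (br_type & SKETCH_SAMPLE_BRANCH_IND_JUMP)
		mask |= SKETCH_BR_IND_JMP;

	if (br_type & SKETCH_SAMPLE_BRANCH_CALL)
		mask |= SKETCH_BR_CALL;

	if (br_type & SKETCH_SAMPLE_BRANCH_ZERO_CALL)
		mask |= SKETCH_BR_ZERO_CALL;

	return mask;
}

/* Requesting both types in the split variant matches the patch's behaviour. */
void sketch_selfcheck(void)
{
	uint64_t both = SKETCH_SAMPLE_BRANCH_CALL | SKETCH_SAMPLE_BRANCH_ZERO_CALL;

	assert(sketch_mask_split(both) ==
	       sketch_mask_combined(SKETCH_SAMPLE_BRANCH_CALL));
}
```

With the split variant, a tool that wants today's behaviour would request both 
types, while one that only cares about the call-to-next-instruction idiom could 
request the zero-length type alone.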

Thanks,

	Ingo