Message-ID: <20120427151852.GH18810@erda.amd.com>
Date:	Fri, 27 Apr 2012 17:18:52 +0200
From:	Robert Richter <robert.richter@....com>
To:	Stephane Eranian <eranian@...gle.com>
CC:	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 06/12] perf/x86-ibs: Precise event sampling with IBS for
 AMD CPUs
On 27.04.12 15:10:22, Stephane Eranian wrote:
> perf record -a -e cpu-cycles:p ...    # use ibs op counting cycle count
> perf record -a -e r076:p ...          # same as -e cpu-cycles:p
> perf record -a -e r0C1:p ...          # use ibs op counting micro-ops
> 
> Each IBS sample contains a linear address that points to the
> instruction that was causing the sample to trigger. With ibs we have
> skid 0.
> 
> Though the skid is 0, we map IBS sampling to following precise levels:
> 
>  1: RIP taken from IBS sample or (if invalid) from stack.
> 
> I assume by stack you mean pt_regs, right?
Right.
> 
> 2: RIP always taken from IBS sample, samples with an invalid rip
>    are dropped. Thus samples of an event containing two precise
>    modifiers (e.g. r076:pp) only contain (precise) addresses
>    detected with IBS.
> 
> I don't think you need the distinction between 1 and 2. You can
> always use the pt_regs IP as a fallback. You can mark that the
> IP is precise with the MISC_EXACT flag in the sample header.
> This is how it's done with PEBS. What's wrong with that?
> It may actually be better than dropping samples silently as it
> may introduce some bias.
There is nothing wrong with it. I have already implemented support
for the MISC_EXACT flag. However, the flag is basically unused in the
perf tool, and there is no modifier or similar to request only
precise rips.
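
To illustrate what evaluating the flag in a tool could look like, here
is a minimal sketch. It is not taken from the perf tool. The struct and
the SKETCH_MISC_EXACT_IP name are hypothetical stand-ins, and the bit
value is only assumed to mirror PERF_RECORD_MISC_EXACT_IP from
<linux/perf_event.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror PERF_RECORD_MISC_EXACT_IP from <linux/perf_event.h>. */
#define SKETCH_MISC_EXACT_IP (1u << 14)

/* Hypothetical, simplified stand-in for the sample header. */
struct sketch_sample_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Count the samples whose IP is flagged as exact in the misc field. */
static size_t count_exact(const struct sketch_sample_header *hdrs, size_t n)
{
	size_t exact = 0;

	assert(hdrs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (hdrs[i].misc & SKETCH_MISC_EXACT_IP)
			exact++;

	return exact;
}
```

A tool-side filter of this kind would keep every sample in the data
file and leave the decision about imprecise rips to the consumer.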

Suppose you want to use perf-annotate. Then you only want to get
precise rips. With the levels suggested above you can do so with:

 perf record -a -e r076:pp ... | perf annotate ...

(Note the double-p.)

For unbiased sampling (e.g. counting or statistics) you take level 1
and you get every sample:

 perf record -a -e r076:p ...
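
The difference between the two levels can be sketched as follows. This
is only an illustration of the selection logic described above, not the
code from the patch. The struct ibs_sample type and the ibs_pick_ip()
helper are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one IBS sample. */
struct ibs_sample {
	bool     rip_valid;	/* IBS reported a valid linear address */
	uint64_t ibs_rip;	/* address taken from the IBS sample */
	uint64_t regs_ip;	/* interrupted IP from pt_regs */
};

/*
 * Decide which IP to report for a given precise level.
 * Returns false if the sample is to be dropped.
 */
static bool ibs_pick_ip(const struct ibs_sample *s, int precise_level,
			uint64_t *ip, bool *exact)
{
	assert(precise_level == 1 || precise_level == 2);

	if (s->rip_valid) {
		*ip = s->ibs_rip;
		*exact = true;	/* would be marked with MISC_EXACT */
		return true;
	}

	if (precise_level == 2)
		return false;	/* :pp keeps only addresses from IBS */

	*ip = s->regs_ip;	/* :p falls back to pt_regs */
	*exact = false;
	return true;
}
```

With level 1 every sample is reported and only the exact marking
differs. With level 2 the samples without a valid IBS rip are dropped.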

There is no modifier to evaluate MISC_EXACT in the same way. That is
why I chose the levels above. I didn't have a better idea.

-Robert
-- 
Advanced Micro Devices, Inc.
Operating System Research Center