Message-ID: <20140815143121.GJ19379@twins.programming.kicks-ass.net>
Date: Fri, 15 Aug 2014 16:31:21 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Stephane Eranian <eranian@...gle.com>
Cc: Andi Kleen <ak@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Andi Kleen <andi@...stfloor.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH 4/5] perf, x86: Add INST_RETIRED.ALL workarounds

On Thu, Aug 14, 2014 at 07:47:56PM +0200, Stephane Eranian wrote:
> [+perf tool maintainers]
>
> On Thu, Aug 14, 2014 at 4:30 PM, Andi Kleen <ak@...ux.intel.com> wrote:
> >
> > I understand all your points, but there's no alternative.
> > The only other way would be to disable INST_RETIRED.ALL.
> >
> You cannot do that either. INST_RETIRED:ALL is important. I assume
> the bug applies whether or not the event is used with a filter.
>
> I think we need to ensure that by looking at the perf.data file, one
> can reconstruct the total number of inst_Retired:all occurrences for
> the run. With a fixed period, one would do num_samples * fixed_period.
> I know the Gooda tool does that. It is used to estimate the number of
> events captured vs. the number of events occurring.

OK, I think we can make that work; IFF we guarantee
perf_event_attr::sample_period >= 128.

Suppose we start out with sample_period=192; then we'll set period_left
to 192, we'll end up with left = 128 (we truncate the lower bits). We
get an interrupt, find that period_left = 64 (>0 so we return 0 and
don't get an overflow handler), up that to 128. Then we trigger again,
at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
an overflow). We increment with sample_period so we get left = 128. We
fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
overflow). And on and on.

So while the individual interrupts are 'wrong' we get them with
interval=256,128 in exactly the right ratio to average out at 192. And
this works for everything >=128.

So the num_samples*fixed_period thing is still entirely correct +- 127,
which is good enough I'd say, as you already have that error anyhow.
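
The arithmetic above can be checked with a small user-space model. This
is only a sketch of the scheme as described in this mail, not the kernel
code: the function name simulate_overflows, the granularity parameter and
the way the programmed count is rounded are assumptions made for
illustration.

```python
def simulate_overflows(sample_period, total_events, granularity=128):
    """Model the period_left bookkeeping described above.

    Returns the event counts n at which an overflow (sample) is reported.
    Hypothetical helper: the rounding rule (truncate the low bits, but
    never program less than one granule) is taken from the prose, not
    from the actual patch.
    """
    if sample_period < granularity:
        raise ValueError("sample_period must be >= granularity")

    period_left = sample_period
    n = 0
    overflows = []
    while True:
        # Count programmed into the counter: low bits truncated,
        # bumped up to one granule if that would leave nothing.
        left = max(period_left - period_left % granularity, granularity)
        if n + left > total_events:
            break
        n += left                # the counter fires after 'left' events
        period_left -= left
        if period_left <= 0:     # "return 1": overflow is reported
            overflows.append(n)
            period_left += sample_period
        # else "return 0": no overflow, just reprogram the counter
    return overflows


hits = simulate_overflows(192, 800)
print(hits)  # [256, 384, 640, 768]
print([b - a for a, b in zip([0] + hits, hits)])  # [256, 128, 256, 128]
```

For sample_period=192 the model reports overflows at n=256, 384, 640 and
768, i.e. intervals of 256,128 repeating, and in this model the true
count at each overflow exceeds num_samples*sample_period by less than 128.
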
So no need to 'fix' the tools, all we need to do is refuse to create
INST_RETIRED:ALL events with sample_period < 128.
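
As a rough model of that check (again a sketch, not kernel code): the
names check_sample_period and MIN_PERIOD are hypothetical, and returning
-EINVAL is an assumption about how such a refusal would be reported.

```python
import errno

# Minimum period assumed from the discussion above.
MIN_PERIOD = 128


def check_sample_period(is_inst_retired_all, sample_period):
    """Return 0 if the event may be created, a negative errno otherwise.

    Hypothetical helper: only INST_RETIRED:ALL events are restricted,
    every other event is accepted unchanged.
    """
    if is_inst_retired_all and sample_period < MIN_PERIOD:
        return -errno.EINVAL
    return 0


print(check_sample_period(True, 100) == 0)   # False: refused
print(check_sample_period(True, 192) == 0)   # True: accepted
print(check_sample_period(False, 100) == 0)  # True: other events unaffected
```
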