[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161215171008.GX3124@twins.programming.kicks-ass.net>
Date: Thu, 15 Dec 2016 18:10:08 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Stephane Eranian <eranian@...gle.com>
Cc: Jiri Olsa <jolsa@...hat.com>, Andi Kleen <ak@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
"mingo@...e.hu" <mingo@...e.hu>,
"Liang, Kan" <kan.liang@...el.com>,
Namhyung Kim <namhyung@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH 2/3] perf/x86/pebs: add workaround for broken OVFL status
on HSW
On Thu, Dec 15, 2016 at 08:59:56AM -0800, Stephane Eranian wrote:
> On Thu, Dec 15, 2016 at 12:42 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Wed, Dec 14, 2016 at 11:26:49PM -0800, Stephane Eranian wrote:
> >> On Wed, Dec 14, 2016 at 9:55 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> >> >
> >> > Just spotted this again, ping?
> >> >
> >> Ok, on what processor running what command, so I can try and reproduce?
> >
> > For me its more of a correctness issue, i've not actually spotted a
> > problem as such.
> >
> > But every time I read this code it makes me wonder.
> >
> > Supposing that the hardware sets the CTRL overflow flags but hasn't
> > generated the PEBS record yet (or not enough records to reach the PEBS
> > buffer threshold) we still don't want to process these events as if they
> > were !PEBS.
> >
> I am suspicious about the case where you have multiple PEBS events and
> they do not quite fire at the same time but close enough that you may have
> PEBS in-flight by the time you enter handle_irq.
>
> Last night I ran a simple test on SKL using tip.git:
>
> $ perf record --e
> cpu/event=0xd0,umask=0x81/upp,cpu/event=0xc0,umask=1/upp,cpu/event=0xd0,umask=0x81/upp
> multichase; perf report -D | fgrep SAMPLE | grep -v 'IP, 0x4' | grep
> -v events
>
> Basically, looking for samples missing the EXACT tag, i.e., samples
> processed a regular event when I only have PEBS events. Over 8h, I got
> about 3 or 4 such samples. So there is still a condition where we see
> the overflow as regular and not PEBS. So we need to examine that code
> again looking for possible race with PEBS in flight and not having the
> PEBS overflow bits yet.
Isn't that exactly the case I was talking about? and would be avoided by
the proposed patch?
So semantically the counter overflows and then arms PEBS to record a
record on the next event once its armed (and this can be multiple events
after the overflow, since arming takes a while too).
Now, if the chip manages to raise the regular overflow bit during that
time, you get exactly what is described.
meaning we should unconditionally clear the pebs_enabled.
Powered by blists - more mailing lists