Message-ID: <alpine.DEB.2.11.1505011316420.2300@vincent-weaver-1.umelst.maine.edu>
Date: Fri, 1 May 2015 13:20:17 -0400 (EDT)
From: Vince Weaver <vincent.weaver@...ne.edu>
To: Ingo Molnar <mingo@...nel.org>
cc: Vince Weaver <vincent.weaver@...ne.edu>,
linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Paul Mackerras <paulus@...ba.org>
Subject: Re: perf: WARNING perfevents: irq loop stuck!
On Fri, 1 May 2015, Ingo Molnar wrote:
>
> * Vince Weaver <vincent.weaver@...ne.edu> wrote:
>
> > So this is just a warning, and I've reported it before, but the
> > perf_fuzzer triggers this fairly regularly on my Haswell system.
> >
> > It looks like fixed counter 0 (retired instructions) being set to
> > 0000fffffffffffe occasionally causes an irq loop storm and gets
> > stuck until the PMU state is cleared.
>
> So 0000fffffffffffe corresponds to 2 events left until overflow,
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we
> allow these super short periods.
>
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well,
> one similar to bdw_limit_period()? Something like the patch below?

I spent the morning trying to get a reproducer for this. It turns out to
be complex. It seems that, in addition to fixed counter 0 being set to -2,
at least one other non-fixed counter must be about to overflow.

For example, in this case gen-PMC2 is also poised to overflow at the same
time.

CPU#0: gen-PMC2 ctrl: 00000003ff96764b
CPU#0: gen-PMC2 count: 0000000000000001
gen-PMC2 left: 0000ffffffffffff
...
[ 2408.612442] CPU#0: fixed-PMC0 count: 0000fffffffffffe
It's not always PMC2 but in the warnings there's at least one other
gen-PMC about to overflow at the exact same time as the fixed one.
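
For anyone following the arithmetic, here is a rough sketch of how a
remaining period of 2 ends up as 0000fffffffffffe in the counter. It
assumes 48-bit wide counters, which is what the 0000ffff... values in the
dump suggest. CNTVAL_BITS and both helper names are made up for
illustration and are not the kernel's own code.

```c
#include <stdint.h>

/* Assumption: counters are 48 bits wide, matching the dump above. */
#define CNTVAL_BITS 48
#define CNTVAL_MASK ((1ULL << CNTVAL_BITS) - 1)

/*
 * Hypothetical helper: the value to program into a counter so that it
 * overflows after 'left' more events, i.e. -left truncated to the
 * counter width.
 */
static uint64_t period_to_count(uint64_t left)
{
	return (0 - left) & CNTVAL_MASK;
}

/*
 * Hypothetical helper: how many more events a counter holding 'count'
 * can take before it wraps past the top of its range.
 */
static uint64_t events_until_overflow(uint64_t count)
{
	return (CNTVAL_MASK + 1) - (count & CNTVAL_MASK);
}
```

With these, period_to_count(2) gives 0x0000fffffffffffe and
events_until_overflow(0x0000fffffffffffe) gives 2, which matches the "2
events left until overflow" reading of the fixed-PMC0 value.
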
Vince