Message-ID: <20100929200323.GC26290@redhat.com>
Date: Wed, 29 Sep 2010 16:03:23 -0400
From: Don Zickus <dzickus@...hat.com>
To: Stephane Eranian <eranian@...gle.com>
Cc: Robert Richter <robert.richter@....com>,
Cyrill Gorcunov <gorcunov@...il.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"hpa@...or.com" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"yinghai@...nel.org" <yinghai@...nel.org>,
"andi@...stfloor.org" <andi@...stfloor.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"ying.huang@...el.com" <ying.huang@...el.com>,
"fweisbec@...il.com" <fweisbec@...il.com>,
"ming.m.lin@...el.com" <ming.m.lin@...el.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [tip:perf/urgent] perf, x86: Catch spurious interrupts after
disabling counters
On Wed, Sep 29, 2010 at 09:42:26PM +0200, Stephane Eranian wrote:
> On Wed, Sep 29, 2010 at 8:12 PM, Don Zickus <dzickus@...hat.com> wrote:
> > Robert,
> >
> > I think you missed Stephane's point. Say for example, kgdb is being used
> > while we are doing stuff with the perf counter (and say kgdb's handler is
> > a lower priority than perf; which isn't true I know, but let's say):
> >
> Yes, exactly my point. The reality is you cannot afford to have false positives
> because you may starve another subsystem of an important notification.
>
> I think it boils down to whether or not we need an error message (Dazed) in
> case no subsystem claimed the NMI. If you were to just silently consume the
> NMI when no subsystem claims it, then you would not have these issues.
>
> What Don has done is use a heuristic which gets activated when a PMU
> interrupt handler signals that more than one counter has overflowed. His
> claim is that this situation is likely to trigger back-to-back NMIs.
Actually it's Robert's heuristic. :-)
>
> The reason this heuristic works is because it waits until ALL the subsystems
> have seen the notification before it declares that the NMI was PMU spurious.
> To do that it uses the DIE_NMI_UNKNOWN callchain. Handlers on this chain
> get called last, after all subsystems have seen the notification once. I believe
> that is the only way to safely "consume" a "spurious" NMI and avoid
> the 'Dazed' message. Anything else runs the risk of starving the other
> subsystems.
I agree.
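
For anyone following along, here is a rough userspace sketch of the ordering
described above. It is not the kernel code: every name in it (deliver_nmi,
pmu_handler, kgdb_handler, the EV_* and NMI_* enums, the sequence counter) is
made up for illustration, and the real per-cpu bookkeeping is more involved.
It only tries to show that the PMU handler marks the *next* NMI as possibly
spurious when it handled more than one counter, and that it swallows an NMI
only on the second, "unknown" pass, after every other handler has declined it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DIE_NMI and DIE_NMI_UNKNOWN passes. */
enum nmi_event  { EV_NMI, EV_NMI_UNKNOWN };
enum nmi_result { NMI_CLAIMED, NMI_SWALLOWED, NMI_DAZED };

typedef bool (*nmi_handler)(enum nmi_event ev, unsigned long seq);

/* Simulated hardware state (assumption: set by the caller before an NMI). */
static int  pmu_overflowed;   /* counters with an overflow pending */
static bool kgdb_pending;     /* another subsystem wants this NMI */

/* Heuristic state: which NMI, if any, we expect to be a spurious PMU one. */
static bool          pmu_marked;
static unsigned long pmu_marked_seq;

static bool pmu_handler(enum nmi_event ev, unsigned long seq)
{
    if (ev == EV_NMI_UNKNOWN) {
        /* Last pass: nobody claimed it. Swallow only the NMI we predicted. */
        if (pmu_marked && pmu_marked_seq == seq) {
            pmu_marked = false;
            return true;
        }
        return false;
    }

    int handled = pmu_overflowed;
    pmu_overflowed = 0;
    if (handled == 0)
        return false;             /* not ours, let the others look at it */
    if (handled > 1) {
        /* Several counters overflowed: a back-to-back NMI may follow. */
        pmu_marked = true;
        pmu_marked_seq = seq + 1;
    }
    return true;
}

static bool kgdb_handler(enum nmi_event ev, unsigned long seq)
{
    (void)seq;
    if (ev != EV_NMI || !kgdb_pending)
        return false;
    kgdb_pending = false;
    return true;
}

/* PMU is deliberately first, i.e. higher priority than kgdb. */
static nmi_handler chain[] = { pmu_handler, kgdb_handler };
static unsigned long nmi_seq;

static bool walk_chain(enum nmi_event ev, unsigned long seq)
{
    for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++)
        if (chain[i](ev, seq))
            return true;
    return false;
}

/* Deliver one simulated NMI and report what happened to it. */
enum nmi_result deliver_nmi(void)
{
    unsigned long seq = ++nmi_seq;

    if (walk_chain(EV_NMI, seq))
        return NMI_CLAIMED;       /* some subsystem owned it */
    if (walk_chain(EV_NMI_UNKNOWN, seq))
        return NMI_SWALLOWED;     /* predicted spurious PMU NMI */
    return NMI_DAZED;             /* would print the 'Dazed' message */
}
```

In this sketch, an NMI that arrives right after a multi-counter overflow and
that kgdb wants is still claimed by kgdb on the first pass, even though the
PMU handler sits ahead of it in the chain. The PMU mark only matters when the
first pass comes back empty.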
Cheers,
Don