Message-ID: <20100902131112.GR22783@erda.amd.com>
Date: Thu, 2 Sep 2010 15:11:12 +0200
From: Robert Richter <robert.richter@....com>
To: Stephane Eranian <eranian@...gle.com>
CC: Don Zickus <dzickus@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...e.hu" <mingo@...e.hu>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 4/4] [x86] perf: fix accidentally ack'ing a second
event on intel perf counter
On 02.09.10 04:13:19, Stephane Eranian wrote:
> Robert,
>
> Do you have the test program you used to test this?
> I believe the NHM hack does not solve the problem, it
> just makes it harder to appear.
For testing back-to-back nmis I have used:
  perf record -e cycles -e instructions -e cache-references \
      -e cache-misses -e branch-misses -a -- sleep 10
with load on all cpus. But I couldn't reproduce this particular
problem as I do not have such a system available. I think it might
also trigger with only one counter running. Judging from the observed
status bits, only one counter was involved.
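As a rough illustration of the setup described above (the perf command
run "with load on all cpus"), a minimal sketch follows. The helper names
start_load, stop_load and perf_cmd are hypothetical and not part of perf;
the busy loops are only an assumed stand-in for whatever load was
actually used.

```shell
#!/bin/sh
# Sketch only: start_load, stop_load and perf_cmd are hypothetical helpers.

# Print the perf invocation quoted in the message.
perf_cmd() {
    echo "perf record -e cycles -e instructions -e cache-references" \
         "-e cache-misses -e branch-misses -a -- sleep 10"
}

# Start one busy loop per online CPU and remember the PIDs in LOAD_PIDS.
start_load() {
    LOAD_PIDS=""
    ncpus=$(getconf _NPROCESSORS_ONLN)
    i=0
    while [ "$i" -lt "$ncpus" ]; do
        ( while :; do :; done ) &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# Stop the busy loops started by start_load.
stop_load() {
    kill $LOAD_PIDS 2>/dev/null || true
    wait $LOAD_PIDS 2>/dev/null || true
}
```

It could be used as: start_load; eval "$(perf_cmd)"; stop_load.
System-wide profiling with -a typically needs elevated privileges.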
>
> I suspect the real issue is that the GLOBAL_STATUS
> bitmask cannot be trusted. I'd like to verify this.
So yes, it looks like it is a cpu bug with a race when clearing the
status. I didn't check the errata list; maybe it is already known.
>
> Has the problem appeared only on Nehalem or also on
> Westmere?
I don't know.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center