Message-ID: <1293104270.2170.580.camel@laptop>
Date: Thu, 23 Dec 2010 12:37:50 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Stephane Eranian <eranian@...gle.com>
Cc: Lin Ming <ming.m.lin@...el.com>, Ingo Molnar <mingo@...e.hu>,
        Andi Kleen <andi@...stfloor.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Arjan van de Ven <arjan@...radead.org>,
        lkml <linux-kernel@...r.kernel.org>, paulus <paulus@...ba.org>
Subject: Re: [RFC PATCH] perf: Add load latency monitoring on Intel Nehalem/Westmere
On Thu, 2010-12-23 at 12:05 +0100, Stephane Eranian wrote:
> > Value Intel Perf
> > 0x0 Unknown L3 Unknown
> >
> > 0x1 L1 L1-local
> >
> > 0x2 Pending core cache HIT L2-snoop
> > Outstanding core cache miss to
>
> Not clear how you know this is snoop or L2?
>
> I suspect this one is saying you have a request for a line
> for which there is already a pending request underway. Could
> be the first came from prefetchers, the 2nd is actual demand.
>
> Let me check with Intel. The table is unclear.
Right, so cache snoops as used by Intel are data transfer operations, not
only the watching for remote modifications and local invalidation as per
the strict definition. They typically short-circuit a complete fetch and
get the data from a neighboring cache or from otherwise in-flight data.

Since this is a pending fetch, the data is in-flight, and snoop seemed
to apply, but I admit it is somewhat of a stretch.

The L2 came from the usage of "core cache"; I might be wrong on that.

Anyway, it's a bit of an odd one out: you can have the exact same
'problem' of pending fetches on the same line on all levels, yet they
don't provide this 'source' for other levels.

Strictly speaking, this is a stall, not a source, and we could simply
map it to 'unknown' and be done with it.
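
To make the two options for 0x2 concrete, here is a rough sketch of the
mapping as a lookup table. It is not code from the patch: the struct, the
table name, ll_perf_name() and the stall_as_unknown switch are all made up
for illustration, only the five encodings quoted in this mail are listed,
and the 0x0 row assumes "Unknown L3" is Intel's wording and "Unknown" the
perf name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names; this is not the patch's actual code. */
struct ll_source {
	const char *intel;	/* Intel's description, as quoted in the thread */
	const char *perf;	/* proposed perf name, as quoted in the thread */
};

/* Only the five encodings quoted in the thread; the rest are omitted. */
static const struct ll_source ll_sources[] = {
	[0x0] = { "Unknown L3",                     "Unknown"    },
	[0x1] = { "L1",                             "L1-local"   },
	[0x2] = { "Pending core cache HIT",         "L2-snoop"   },
	[0x3] = { "L2",                             "L2-local"   },
	[0x4] = { "L3-snoop, no coherency actions", "L3-snoop-I" },
};

#define LL_NR_SOURCES (sizeof(ll_sources) / sizeof(ll_sources[0]))

/*
 * Return the perf name for a data-source value.
 *
 * With stall_as_unknown set, 0x2 is treated as a stall rather than a
 * source and reported as "Unknown". Values outside the table are also
 * reported as "Unknown".
 */
static const char *ll_perf_name(unsigned int value, int stall_as_unknown)
{
	if (value >= LL_NR_SOURCES)
		return "Unknown";
	if (value == 0x2 && stall_as_unknown)
		return "Unknown";

	/* Every slot in the table above is filled in explicitly. */
	assert(ll_sources[value].perf != NULL);
	return ll_sources[value].perf;
}
```

With stall_as_unknown == 0 the sketch keeps the 'L2-snoop' reading for
0x2; with it set, 0x2 collapses into 'Unknown' and every other value is
unaffected.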
> > the same line was underway
> > 0x3 L2 L2-local
> >
> > 0x4 L3-snoop, no coherency actions L3-snoop-I
>
> I am not sure I understand what you mean by local vs. remote
> in your terminology.
Local being the cache nearest to the CPU, remote being all others.

Admittedly that doesn't really make too much sense for L[12], but
imagine threads having their own L1; then I could imagine a thread
trying to peek in a sibling's L1 since it's so near. In that case it
would make sense to use local vs remote on the L1.