linux-kernel - Re: [RFC PATCH] perf: Add load latency monitoring on Intel Nehalem/Westmere

Message-ID: <1293104270.2170.580.camel@laptop>
Date:	Thu, 23 Dec 2010 12:37:50 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	Lin Ming <ming.m.lin@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Andi Kleen <andi@...stfloor.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Arjan van de Ven <arjan@...radead.org>,
	lkml <linux-kernel@...r.kernel.org>, paulus <paulus@...ba.org>
Subject: Re: [RFC PATCH] perf: Add load latency monitoring on Intel
 Nehalem/Westmere

On Thu, 2010-12-23 at 12:05 +0100, Stephane Eranian wrote:

> > Value   Intel                           Perf
> > 0x0     Unknown L3                      Unknown
> >
> > 0x1     L1                              L1-local
> >
> > 0x2     Pending core cache HIT          L2-snoop
> >        Outstanding core cache miss to
> 
> Not clear how you know this is snoop or L2?
> 
> I suspect this one is saying you have a request for a line
> for which there is already a pending request underway. Could
> be the first came from prefetchers, the 2nd is actual demand.
> 
> Let me check with Intel. The table is unclear.

Right, so cache snoops as used by Intel are data transfer operations
(not only the watching for remote modifications and local invalidation
as per the strict definition); they typically short-circuit a complete
fetch by getting the data from a neighboring cache or from an otherwise
in-flight transfer.

Since this is a pending fetch, the data is in-flight, and snoop seemed
to apply, but I admit it is somewhat of a stretch.

The L2 came from the usage of "core cache"; I might be wrong on that.

Anyway, it's a bit of an odd one out: you can have the exact same
'problem' of pending fetches on the same line on all levels, yet they
don't provide this 'source' for other levels.

Strictly speaking, this is a stall, not a source, and we could simply
map it to 'unknown' and be done with it.
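
To make that concrete, here is a minimal sketch of what such a mapping
could look like as a lookup table. The identifiers (ldlat_source_names,
ldlat_source_name) are hypothetical and not taken from the patch; only
the encodings 0x0-0x4 quoted in this thread are filled in, with 0x2
folded into 'unknown' as suggested above, and anything else also falls
back to 'unknown'.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical table: Intel load-latency data source encoding -> perf label.
 * Only the values discussed in this thread are listed.
 */
static const char *const ldlat_source_names[] = {
	[0x0] = "Unknown",
	[0x1] = "L1-local",
	[0x2] = "Unknown",	/* pending fetch: a stall, not a source */
	[0x3] = "L2-local",
	[0x4] = "L3-snoop-I",
};

/* Return the label for an encoding, or "Unknown" if it is not in the table. */
static const char *ldlat_source_name(uint8_t src)
{
	size_t n = sizeof(ldlat_source_names) / sizeof(ldlat_source_names[0]);

	if (src >= n)
		return "Unknown";
	return ldlat_source_names[src];
}
```

With this, ldlat_source_name(0x2) and ldlat_source_name(0x0) give the
same label, so userspace would not see the pending-fetch case as a
distinct source.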

> >        the same line was underway
> > 0x3     L2                              L2-local
> >
> > 0x4     L3-snoop, no coherency actions  L3-snoop-I
> 
> I am not sure I understand what you mean by local vs. remote
> in your terminology.

Local being the cache nearest to the cpu, remote being all others.

Admittedly that doesn't really make too much sense for L[12], but
imagine threads having their own L1, then I could imagine a thread
trying to peek in a sibling's L1 since it's so near. In that case it
would make sense to use local vs remote on the L1.
