Message-ID: <1289568077.2084.256.camel@laptop>
Date: Fri, 12 Nov 2010 14:21:17 +0100
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
To: Stephane Eranian <eranian@...gle.com>
Cc: Andi Kleen <andi@...stfloor.org>,
Corey Ashford <cjashfor@...ux.vnet.ibm.com>,
Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
fweisbec@...il.com, mingo@...e.hu, acme@...hat.com,
paulus <paulus@...ba.org>, Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH 2/3] perf: Add support for extra parameters for raw
events
On Fri, 2010-11-12 at 14:00 +0100, Stephane Eranian wrote:
> I don't understand what aspect you think is messy. When you are sampling
> cache misses, you expect to get the tuple (instr addr, data addr, latency,
> data source).
It's the data source thing I have most trouble with -- see below. The
need for a latency field isn't immediately clear either: the larger the
bubble, the more hits the instruction will get, so there should already
be a correlation between samples and latency.
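A rough illustration of that correlation, as a toy model only (it is not how any real PMU samples; the function name, cycle costs and period are made up for the example): an instruction that occupies the pipeline for more cycles collects proportionally more periodic samples.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (an assumption for illustration): a loop of n instructions,
 * instruction i occupying the pipeline for cost[i] cycles, sampled once
 * every `period` cycles. hits[i] counts the samples landing on i.
 */
void sketch_sample_loop(const unsigned *cost, size_t n,
                        unsigned long total_cycles, unsigned period,
                        unsigned long *hits)
{
    unsigned long cycle = 0;        /* start cycle of current instruction */
    unsigned long next = period;    /* cycle at which the next sample fires */
    size_t i = 0;

    assert(period > 0 && n > 0);    /* costs are assumed non-zero too */

    for (size_t k = 0; k < n; k++)
        hits[k] = 0;

    while (cycle < total_cycles) {
        unsigned long end = cycle + cost[i];

        /* every sample point inside [cycle, end) is attributed to i */
        while (next < end) {
            hits[i]++;
            next += period;
        }
        cycle = end;
        i = (i + 1) % n;
    }
}
```

With costs {1, 1, 50, 1} and a period of 7, the 50-cycle instruction ends up with roughly 50/53 of all samples, so the sample count by itself already tracks the stall.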
> That is what you get with AMD IBS, Nehalem PEBS-LL and
> also Itanium D-EAR. I am sure IBM Power has something similar as well.
> To collect this, you can either store the info in registers (AMD, Itanium)
> or in a buffer (PEBS). But regardless of that you will always have to expose
> the tuple. We have a solution for two out of 4 fields that reuses the existing
> infrastructure. We need something else for the other two.
Well, if Intel PEBS, IA64 and PPC64 all have a data source thing we can
simply add PERF_SAMPLE_SOURCE or somesuch and use that.
Do IA64/PPC64 have latency fields as well? PERF_SAMPLE_LATENCY would
seem to be the thing to use in that case.
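A minimal sketch of what two extra sample bits could look like to a consumer, assuming they behave like other sample-format bits, where each set bit contributes one u64 to the record in bit order. The SKETCH_* names, their values and the record layout are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, illustrative values only. */
enum sketch_sample_format {
    SKETCH_SAMPLE_IP      = 1u << 0,  /* instruction address */
    SKETCH_SAMPLE_ADDR    = 1u << 1,  /* data address */
    SKETCH_SAMPLE_LATENCY = 1u << 2,  /* stands in for PERF_SAMPLE_LATENCY */
    SKETCH_SAMPLE_SOURCE  = 1u << 3,  /* stands in for PERF_SAMPLE_SOURCE */
};

struct sketch_sample {
    uint64_t ip, addr, latency, source;
};

/* Copy the next word of the record into *dst; 0 if the record is too short. */
static int sketch_take(const uint64_t *rec, size_t nwords, size_t *n,
                       uint64_t *dst)
{
    if (*n >= nwords)
        return 0;
    *dst = rec[(*n)++];
    return 1;
}

/*
 * Decode one record. Fields are present only if their bit is set in
 * sample_type, in bit order. Returns the number of u64 words consumed,
 * or 0 if the record is truncated. Absent fields are left as 0.
 */
size_t sketch_decode(uint32_t sample_type, const uint64_t *rec,
                     size_t nwords, struct sketch_sample *out)
{
    size_t n = 0;

    assert(rec != NULL && out != NULL);
    *out = (struct sketch_sample){0};

    if ((sample_type & SKETCH_SAMPLE_IP) &&
        !sketch_take(rec, nwords, &n, &out->ip))
        return 0;
    if ((sample_type & SKETCH_SAMPLE_ADDR) &&
        !sketch_take(rec, nwords, &n, &out->addr))
        return 0;
    if ((sample_type & SKETCH_SAMPLE_LATENCY) &&
        !sketch_take(rec, nwords, &n, &out->latency))
        return 0;
    if ((sample_type & SKETCH_SAMPLE_SOURCE) &&
        !sketch_take(rec, nwords, &n, &out->source))
        return 0;
    return n;
}
```

The point of the sketch is only that architectures lacking one of the fields would leave its bit clear, and the record shrinks accordingly.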
BTW, what's the status of perf on IA64? And do we really still care
about that platform? It's pretty much dead, isn't it?
> We should expect that in the future PMUs will collect more than code addresses.
Sure, but I hate stuff that counts multiple events on a single base like
IBS does, and LL is similar to that: it's a fetch-retire counter, and then
you report where the fetch was satisfied from. So in effect you're measuring
l1/l2/l3/dram hit/miss all at the same time, but on a fetch basis.
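To make the "all at the same time, on a fetch basis" point concrete, here is a small sketch under stated assumptions: every sample comes from the one base event and carries a data-source tag, and the per-level breakdown only appears when the samples are binned afterwards. The enum and function names are hypothetical, not any hardware encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data-source tags, not a real hardware encoding. */
enum sketch_source {
    SKETCH_SRC_L1,
    SKETCH_SRC_L2,
    SKETCH_SRC_L3,
    SKETCH_SRC_DRAM,
    SKETCH_SRC_MAX
};

/*
 * Bin samples of a single base event by where each fetch was satisfied.
 * hist[] must have SKETCH_SRC_MAX entries; out-of-range tags are skipped.
 */
void sketch_tally(const enum sketch_source *src, size_t n,
                  unsigned long *hist)
{
    assert(hist != NULL && (src != NULL || n == 0));

    for (size_t k = 0; k < SKETCH_SRC_MAX; k++)
        hist[k] = 0;

    for (size_t i = 0; i < n; i++) {
        if ((unsigned)src[i] < SKETCH_SRC_MAX)
            hist[src[i]]++;
    }
}
```

Each bucket is a share of the sampled fetches, not an independent counter, which is what distinguishes this from programming separate l1/l2/l3 miss events.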
Note that we need proper userspace for such crap as well, and libpfm
doesn't count, we need a full analysis tool in perf itself.