Message-ID: <AANLkTinjzZnv8dzfKPzgGtRnk5C-XsAt=gyf_4G0+gf8@mail.gmail.com>
Date: Fri, 12 Nov 2010 14:00:43 +0100
From: Stephane Eranian <eranian@...gle.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Andi Kleen <andi@...stfloor.org>,
Corey Ashford <cjashfor@...ux.vnet.ibm.com>,
Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
fweisbec@...il.com, mingo@...e.hu, acme@...hat.com,
paulus <paulus@...ba.org>
Subject: Re: [PATCH 2/3] perf: Add support for extra parameters for raw events
On Fri, Nov 12, 2010 at 12:36 PM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Fri, 2010-11-12 at 11:49 +0100, Stephane Eranian wrote:
>> The difficulty with PEBS-LL (load latency) is not so much the encoding of the
>> latency. It is more how to expose the data back to the user. You get a full PEBS
>> record for each miss. So you get the PEBS machine state + data addr, miss
>> latency, and data source (where did the line come from). We have already
>> discussed how to expose machine state in general. I'll work on a patch for this.
>
> Frederic was working on this PERF_SAMPLE_REGS stuff as well for his "copy
> the stack top and let dwarfs go wild at it" patches.
>
Ok, I'll talk to him again about this then.
>> So we can get the general PEBS machine state out. However, the question is
>> how do we expose data addr, latency, data source? We can reuse the
>> SAMPLE_ADDR for the data address. Sample IP would point to the load
>> instruction (with help from LBR to correct the off-by-one issue). For
>> the latency
>
> Right, PERF_SAMPLE_IP and PERF_SAMPLE_ADDR
>
>> and data source, I proposed using pseudo regs and leveraging the sampled machine
>> state mechanism. An alternative may be to define a new record type that would be
>> generic enough to be reusable, for instance { instr_addr, data_addr,
>> latency, data_src, flags; }.
>
> I'm not sure I like the idea of pseudo regs.. I'm afraid it'll get messy
> quite quickly. Load-latency is a bit like IBS that way, terribly messy.
>
I don't understand what aspect you think is messy. When you are sampling
cache misses, you expect to get the tuple (instr addr, data addr, latency,
data source). That is what you get with AMD IBS, Nehalem PEBS-LL and
also Itanium D-EAR. I am sure IBM Power has something similar as well.
To collect this, you can either store the info in registers (AMD, Itanium)
or in a buffer (PEBS). But regardless of that you will always have to expose
the tuple. We have a solution for two of the four fields that reuses the existing
infrastructure. We need something else for the other two.
We should expect that in the future PMUs will collect more than code addresses.
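
To make the tuple concrete, here is a rough sketch in C of what a generic record
could look like. Everything in it is an assumption for illustration only: the
struct name mem_sample, the field widths, the MEM_SAMPLE_F_* flag bits and the
mem_sample_format() helper are hypothetical and are not part of any existing
perf ABI. The flags field only marks which of the optional fields a given PMU
was able to fill in.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits: which optional fields the PMU filled in. */
#define MEM_SAMPLE_F_LATENCY  (1ULL << 0)
#define MEM_SAMPLE_F_DATA_SRC (1ULL << 1)

/* Hypothetical generic record for a sampled memory access. */
struct mem_sample {
	uint64_t instr_addr; /* address of the load instruction */
	uint64_t data_addr;  /* address of the data being accessed */
	uint32_t latency;    /* miss latency, in core cycles */
	uint32_t data_src;   /* encoded source of the cache line */
	uint64_t flags;      /* MEM_SAMPLE_F_* validity bits */
};

/*
 * Format a record into buf. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if buf is too small.
 */
static int mem_sample_format(const struct mem_sample *s, char *buf, size_t len)
{
	int n, m;

	n = snprintf(buf, len, "ip=0x%" PRIx64 " addr=0x%" PRIx64,
		     s->instr_addr, s->data_addr);
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (s->flags & MEM_SAMPLE_F_LATENCY) {
		m = snprintf(buf + n, len - n, " lat=%" PRIu32, s->latency);
		if (m < 0 || (size_t)m >= len - n)
			return -1;
		n += m;
	}

	if (s->flags & MEM_SAMPLE_F_DATA_SRC) {
		m = snprintf(buf + n, len - n, " src=0x%" PRIx32, s->data_src);
		if (m < 0 || (size_t)m >= len - n)
			return -1;
		n += m;
	}

	return n;
}
```

The first two fields correspond to what PERF_SAMPLE_IP and PERF_SAMPLE_ADDR
already carry; only latency and data_src need a new home.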