Message-ID: <7c86c4470812092103geca7df8ge9293de6de3f83ac@mail.gmail.com>
Date: Wed, 10 Dec 2008 06:03:05 +0100
From: "stephane eranian" <eranian@...glemail.com>
To: "Paul Mackerras" <paulus@...ba.org>
Cc: "Andi Kleen" <andi@...stfloor.org>, "Ingo Molnar" <mingo@...e.hu>,
linux-kernel@...r.kernel.org,
"Thomas Gleixner" <tglx@...utronix.de>, linux-arch@...r.kernel.org,
"Andrew Morton" <akpm@...ux-foundation.org>,
"Eric Dumazet" <dada1@...mosbay.com>,
"Robert Richter" <robert.richter@....com>,
"Arjan van de Veen" <arjan@...radead.org>,
"Peter Anvin" <hpa@...or.com>,
"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
"Steven Rostedt" <rostedt@...dmis.org>,
"David Miller" <davem@...emloft.net>,
"Paolo Ciarrocchi" <paolo.ciarrocchi@...il.com>
Subject: Re: [patch] Performance Counters for Linux, v2
Paul,
On Wed, Dec 10, 2008 at 5:44 AM, Paul Mackerras <paulus@...ba.org> wrote:
> Andi Kleen writes:
>
>> When you say counting you should also include "event ring buffers with
>> metadata", like PEBS on Intel x86.
>
> I'm not familiar with PEBS. Maybe it's something different again,
> neither sampling nor counting, but a third thing?
>
PEBS (Precise Event-Based Sampling) is an Intel-only feature used for
sampling. The difference is that the hardware (and microcode) does the
sampling for you. You point the CPU to a structure in memory, called DS
(Debug Store), which in turn points to a region of memory you designate,
i.e., the sampling buffer. The buffer can be any size you want.

Then you program counter0 with an event and a sampling period.
When the counter overflows, there is no interrupt; instead, the microcode
records the RIP and full machine state, and reloads the counter with
the period specified in DS. The OS gets an interrupt ONLY when the
buffer fills up. Overhead is thus minimized, but you have no control
over the format of the samples.

The precision (P) comes from the fact that the RIP is guaranteed to
point to the instruction just after the one that generated the event
you're sampling on. The catch is that not all events support PEBS, and
only one counter works with PEBS on Core 2. Nehalem is better: more
events support PEBS, and all 4 generic counters support it. Furthermore,
PEBS can now capture where cache misses occur, very much like what
Itanium can do.

Needless to say, all of this is supported by perfmon.
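
To make the flow described above concrete, here is a toy user-space
model in C. It is only a sketch: the struct layouts and every name in
it (toy_ds, toy_pmu, toy_pmu_event, ...) are made up for illustration
and do not match the real DS area, the real PEBS record format, or any
perfmon interface. It also assumes the interrupt fires exactly when the
buffer is full and that the OS drains the buffer at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample record: the real record also holds machine state. */
struct toy_sample {
    uint64_t rip;              /* address of the instruction after the event */
};

/* Hypothetical stand-in for the DS structure. */
struct toy_ds {
    struct toy_sample *base;   /* start of the sampling buffer */
    size_t capacity;           /* number of records the buffer can hold */
    size_t index;              /* next free record */
    uint64_t period;           /* value used to reload the counter */
};

/* Hypothetical stand-in for one counter set up for PEBS. */
struct toy_pmu {
    struct toy_ds *ds;
    uint64_t remaining;        /* events left before the counter overflows */
    unsigned interrupts;       /* how many times the OS was interrupted */
};

/* Point the "CPU" at DS, and DS at the buffer; arm the counter. */
void toy_pmu_init(struct toy_pmu *pmu, struct toy_ds *ds,
                  struct toy_sample *buf, size_t capacity, uint64_t period)
{
    assert(capacity > 0 && period > 0);
    ds->base = buf;
    ds->capacity = capacity;
    ds->index = 0;
    ds->period = period;
    pmu->ds = ds;
    pmu->remaining = period;
    pmu->interrupts = 0;
}

/* Model one occurrence of the monitored event. */
void toy_pmu_event(struct toy_pmu *pmu, uint64_t next_rip)
{
    struct toy_ds *ds = pmu->ds;

    if (--pmu->remaining != 0)
        return;                          /* no overflow yet, nothing recorded */

    /* Overflow: record a sample and reload, without interrupting the OS. */
    ds->base[ds->index++].rip = next_rip;
    pmu->remaining = ds->period;

    /* Only a full buffer raises an interrupt; the OS then drains it. */
    if (ds->index == ds->capacity) {
        pmu->interrupts++;
        ds->index = 0;
    }
}
```

With a 4-entry buffer and a period of 10, feeding 100 events through
toy_pmu_event() records 10 samples but interrupts the OS only twice,
which is the overhead reduction mentioned above.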
Hope this helps.