Message-ID: <1288682858.12061.105.camel@minggr.sh.intel.com>
Date: Tue, 02 Nov 2010 15:27:38 +0800
From: Lin Ming <ming.m.lin@...el.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...hat.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Arjan van de Ven <arjan@...radead.org>,
Stephane Eranian <eranian@...gle.com>, robert.richter@....com,
Cyrill Gorcunov <gorcunov@...il.com>, paulus@...ba.org,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
CoreyAshford <cjashfor@...ux.vnet.ibm.com>,
lkml <linux-kernel@...r.kernel.org>
Subject: [DRAFT PATCH 0/3] perf: Add Intel Nehalem uncore pmu support
Hi, all
Here is the draft patch to add Intel Nehalem uncore pmu support.
It's not fully functional yet, but I am sending it out early to get comments.
For the background of Nehalem uncore pmu, see Intel SDM Volume 3B
"30.6.2 Performance Monitoring Facility in the Uncore"
1. data structure
struct node_hw_events {
	struct perf_event *events[UNCORE_NUM_COUNTERS];
	int n_events;
	struct spinlock lock;
	int enabled;
};
struct node_hw_events is the per-node structure.
"lock" protects adding events to and deleting events from the uncore pmu.
struct uncore_cpu_hw_events {
	unsigned long active_mask[BITS_TO_LONGS(UNCORE_NUM_COUNTERS)];
};
struct uncore_cpu_hw_events is the per logical cpu structure.
"active_mask" represents the counters used by the cpu.
For example, if bits 3 and 6 are set for cpuX, then uncore counters
3 and 6 are used by cpuX.
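The bit 3 / bit 6 example can be written out as a small userspace sketch. The helpers uncore_mask_set() and uncore_mask_test() are hypothetical stand-ins for the kernel's bitmap operations, the BITS_TO_LONGS definition here is a simplified userspace version, and UNCORE_NUM_COUNTERS = 8 is again an assumed value.

```c
#include <limits.h>
#include <stdbool.h>

#define UNCORE_NUM_COUNTERS 8	/* assumed value, for illustration only */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct uncore_cpu_hw_events {
	unsigned long active_mask[BITS_TO_LONGS(UNCORE_NUM_COUNTERS)];
};

/* Hypothetical helper: mark counter idx as used by this cpu. */
static void uncore_mask_set(struct uncore_cpu_hw_events *cpuc, int idx)
{
	cpuc->active_mask[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}

/* Hypothetical helper: is counter idx used by this cpu? */
static bool uncore_mask_test(const struct uncore_cpu_hw_events *cpuc, int idx)
{
	return cpuc->active_mask[idx / BITS_PER_LONG] &
	       (1UL << (idx % BITS_PER_LONG));
}
```

Calling uncore_mask_set() with 3 and then 6 on a zeroed structure leaves active_mask[0] equal to 0x48.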
2. Uncore pmu NMI handling
Every core in the socket can be programmed to receive the uncore counter
overflow interrupt.
In this draft implementation, each core handles the overflow interrupt
caused by the counters with bit set in "active_mask".
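The filtering described above can be sketched as follows. This is a simplified userspace model, not the handler from the patch: overflow_status is assumed to be a word with one bit per overflowed counter, the function name uncore_handle_overflow() is hypothetical, and UNCORE_NUM_COUNTERS = 8 is an assumed value.

```c
#include <stdint.h>

#define UNCORE_NUM_COUNTERS 8	/* assumed value, for illustration only */

/*
 * Hypothetical helper: given this cpu's active_mask and an assumed
 * per-counter overflow status word, collect in *ack the overflow bits
 * this cpu is responsible for. Returns how many counters were handled.
 */
static int uncore_handle_overflow(unsigned long active_mask,
				  uint64_t overflow_status, uint64_t *ack)
{
	int idx, handled = 0;

	*ack = 0;
	for (idx = 0; idx < UNCORE_NUM_COUNTERS; idx++) {
		uint64_t bit = 1ULL << idx;

		if (!(active_mask & (1UL << idx)))
			continue;	/* counter belongs to another cpu */
		if (!(overflow_status & bit))
			continue;	/* counter did not overflow */
		*ack |= bit;
		handled++;
	}
	return handled;
}
```

With active_mask = 0x48 (counters 3 and 6) and overflow_status = 0x28 (counters 3 and 5), only counter 3 is handled; counter 5 is left for whichever cpu has it in its own mask.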
3. perf tool update
In this draft, the uncore events are monitored as raw events with the "ru"
prefix ("u" for uncore).
  ./perf stat -e ru0101 -- ls
   Performance counter stats for 'ls':
               795920  raw 0x101
          0.002110130  seconds time elapsed
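For illustration, the "ru" syntax could be parsed roughly as below. This is a standalone sketch, not the change made to tools/perf/util/parse-events.c; struct raw_event and parse_raw_event() are hypothetical names.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch. */
struct raw_event {
	uint64_t config;	/* raw event code, e.g. 0x101 */
	int uncore;		/* 1 if the "u" marker was present */
};

/* Parse "r<hex>" or "ru<hex>". Returns 0 on success, -1 on bad input. */
static int parse_raw_event(const char *str, struct raw_event *ev)
{
	char *end;

	if (str[0] != 'r')
		return -1;
	str++;

	ev->uncore = 0;
	if (str[0] == 'u') {
		ev->uncore = 1;
		str++;
	}

	/* Require at least one hex digit after the prefix. */
	if (!isxdigit((unsigned char)str[0]))
		return -1;

	ev->config = strtoull(str, &end, 16);
	if (*end != '\0')
		return -1;
	return 0;
}
```

Since "u" is not a hexadecimal digit, "ru0101" cannot be confused with a plain raw event such as "r0101".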
4. Issues
How can we eliminate the duplicate counter values accumulated by multiple
child processes on the same socket?
  perf stat -e ru0101 -- make -j4
Assume the 4 "make" child processes are running on the same socket and
counting uncore raw event "0101", and the counter values read by them are
val0, val1, val2, val3.
Then the final counter result given by "perf stat" will be "val0 + val1
+ val2 + val3".
But this is obviously wrong: the uncore counter is shared by all
cores in the socket, so the final result should not be accumulated.
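To make the problem concrete, the sketch below contrasts the naive sum with one possible way of counting each socket only once. It is not a proposed fix and not code from the patch: struct uncore_read, MAX_SOCKETS, and the policy of keeping the largest value seen per socket are all assumptions chosen for illustration.

```c
#include <stdint.h>

#define MAX_SOCKETS 8	/* assumed limit, for illustration only */

/* Hypothetical record: one child's read of the shared uncore counter. */
struct uncore_read {
	int socket;
	uint64_t value;
};

/* What happens today: every child's value is added up. */
static uint64_t sum_naive(const struct uncore_read *r, int n)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < n; i++)
		total += r[i].value;
	return total;
}

/* One possible policy: keep a single (largest) value per socket. */
static uint64_t sum_per_socket(const struct uncore_read *r, int n)
{
	uint64_t per_socket[MAX_SOCKETS] = { 0 };
	uint64_t total = 0;
	int i;

	for (i = 0; i < n; i++) {
		int s = r[i].socket;

		if (s < 0 || s >= MAX_SOCKETS)
			continue;	/* ignore out-of-range sockets */
		if (r[i].value > per_socket[s])
			per_socket[s] = r[i].value;
	}
	for (i = 0; i < MAX_SOCKETS; i++)
		total += per_socket[i];
	return total;
}
```

If all four children on socket 0 read 100, the naive sum reports 400 while the per-socket version reports 100.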
Any comments are much appreciated.
 arch/x86/include/asm/msr-index.h              |    1 +
 arch/x86/kernel/cpu/perf_event.c              |   30 ++-
 arch/x86/kernel/cpu/perf_event_intel.c        |    4 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |  280 +++++++++++++++++++++++++
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |   80 +++++++
 arch/x86/kernel/cpu/perf_event_p4.c           |    2 +-
 include/linux/perf_event.h                    |    1 +
 tools/perf/util/parse-events.c                |   14 +-
 8 files changed, 394 insertions(+), 18 deletions(-)
Thanks,
Lin Ming