linux-kernel - [DRAFT PATCH 0/3] perf: Add Intel Nehalem uncore pmu support

Message-ID: <1288682858.12061.105.camel@minggr.sh.intel.com>
Date:	Tue, 02 Nov 2010 15:27:38 +0800
From:	Lin Ming <ming.m.lin@...el.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...hat.com>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Arjan van de Ven <arjan@...radead.org>,
	Stephane Eranian <eranian@...gle.com>, robert.richter@....com,
	Cyrill Gorcunov <gorcunov@...il.com>, paulus@...ba.org,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	CoreyAshford <cjashfor@...ux.vnet.ibm.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: [DRAFT PATCH 0/3] perf: Add Intel Nehalem uncore pmu support

Hi, all

Here is a draft patch that adds Intel Nehalem uncore pmu support.
It is not fully functional yet, but I am posting it early to get
comments.

For background on the Nehalem uncore pmu, see Intel SDM Volume 3B,
"30.6.2 Performance Monitoring Facility in the Uncore".

1. data structure

struct node_hw_events {
        struct perf_event *events[UNCORE_NUM_COUNTERS];
        int n_events;
        struct spinlock lock;
        int enabled;
};

struct node_hw_events is the per-node structure.
"lock" protects adding and deleting events on the uncore pmu.

struct uncore_cpu_hw_events {
        unsigned long active_mask[BITS_TO_LONGS(UNCORE_NUM_COUNTERS)];
};

struct uncore_cpu_hw_events is the per-logical-cpu structure.
"active_mask" represents the uncore counters used by that cpu.
For example, if bits 3 and 6 are set for cpuX, then uncore counters
3 and 6 are used by cpuX.
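
To make the "active_mask" example concrete, here is a minimal userspace sketch. It is not code from the patch. UNCORE_NUM_COUNTERS is assumed to be 8 purely for illustration, and mark_counter_used()/counter_used() are hypothetical stand-ins for the kernel's bit helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption for illustration only; the patch defines the real value. */
#define UNCORE_NUM_COUNTERS 8
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct uncore_cpu_hw_events {
        unsigned long active_mask[BITS_TO_LONGS(UNCORE_NUM_COUNTERS)];
};

/* Hypothetical helper: record that this cpu uses uncore counter idx. */
static void mark_counter_used(struct uncore_cpu_hw_events *c, unsigned int idx)
{
        assert(idx < UNCORE_NUM_COUNTERS);
        c->active_mask[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}

/* Hypothetical helper: does this cpu use uncore counter idx? */
static bool counter_used(const struct uncore_cpu_hw_events *c, unsigned int idx)
{
        assert(idx < UNCORE_NUM_COUNTERS);
        return (c->active_mask[idx / BITS_PER_LONG] >> (idx % BITS_PER_LONG)) & 1UL;
}
```

Calling mark_counter_used() with 3 and then 6 on a zeroed structure leaves active_mask[0] equal to 0x48, which is the cpuX case described above.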

2. Uncore pmu NMI handling

Every core in the socket can be programmed to receive the uncore
counter overflow interrupt.

In this draft implementation, each core handles only the overflow
interrupts caused by counters whose bits are set in its "active_mask".
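
The filtering rule above can be sketched in userspace as follows. This is only an illustration under assumptions, not the handler from the patch: the overflow status is modelled as a plain bitmask with one bit per uncore counter, UNCORE_NUM_COUNTERS is assumed to be 8, and handle_uncore_overflow() is a hypothetical name.

```c
#include <assert.h>

/* Assumption for illustration only; the patch defines the real value. */
#define UNCORE_NUM_COUNTERS 8

/*
 * Hypothetical sketch: handle only the overflowed counters that this cpu
 * owns according to its active_mask. Bit i of *overflow_status is assumed
 * to mean "uncore counter i overflowed". Handled bits are cleared from
 * *overflow_status; the number of counters handled is returned.
 */
static int handle_uncore_overflow(unsigned long active_mask,
                                  unsigned long *overflow_status)
{
        unsigned long mine;
        int handled = 0;
        int bit;

        assert(overflow_status);
        mine = *overflow_status & active_mask;

        for (bit = 0; bit < UNCORE_NUM_COUNTERS; bit++) {
                if (mine & (1UL << bit)) {
                        /* A real handler would update the perf_event here. */
                        handled++;
                }
        }

        /* Leave overflows of counters owned by other cpus untouched. */
        *overflow_status &= ~mine;
        return handled;
}
```

With active_mask 0x48 (counters 3 and 6) and an overflow status of 0x4a (counters 1, 3 and 6), this cpu handles two counters and leaves bit 1 set for whichever cpu owns counter 1.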

3. perf tool update

In this draft, uncore events are monitored as raw events with an "ru"
prefix ("u" for uncore).

./perf stat -e ru0101 -- ls

Performance counter stats for 'ls':

             795920  raw 0x101               

        0.002110130  seconds time elapsed
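
For illustration, the "ru" syntax could be recognised along these lines. This is a standalone sketch, not the change made to tools/perf/util/parse-events.c; struct raw_event and parse_raw_event_sketch() are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type: raw config value plus an "uncore" flag. */
struct raw_event {
        unsigned long long config;
        bool uncore;
};

/*
 * Hypothetical parser for "r<hex>" (core raw event) and "ru<hex>"
 * (uncore raw event). Returns 0 on success, -1 on malformed input.
 */
static int parse_raw_event_sketch(const char *str, struct raw_event *ev)
{
        char *end;

        assert(str && ev);

        if (*str++ != 'r')
                return -1;

        ev->uncore = (*str == 'u');
        if (ev->uncore)
                str++;

        /* Require at least one hex digit right after the prefix. */
        if (!isxdigit((unsigned char)*str))
                return -1;

        errno = 0;
        ev->config = strtoull(str, &end, 16);
        if (errno || *end != '\0')
                return -1;

        return 0;
}
```

Under these assumptions, "ru0101" parses to config 0x101 with the uncore flag set, matching the "raw 0x101" line in the output above, while "r0101" yields the same config with the flag clear.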

4. Issues

How can we eliminate the duplicate counter values accumulated by
multiple child processes on the same socket?

perf stat -e ru0101 -- make -j4

Assume the 4 "make" child processes are running on the same socket and
counting uncore raw event "0101", and the counter values they read are
val0, val1, val2 and val3.

Then the final counter result given by "perf stat" will be "val0 + val1
+ val2 + val3".

But this is obviously wrong: the uncore counter is shared by all cores
in the socket, so the final result should not be accumulated.
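
The over-counting can be shown with a small userspace sketch. It is purely illustrative and not a proposed fix: struct reading, naive_sum() and per_socket_sum() are hypothetical, MAX_SOCKETS is an arbitrary bound, and per_socket_sum() simply keeps the first reading seen for each socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary bound for this sketch. */
#define MAX_SOCKETS 8

/* Hypothetical record: one child's reading of a shared uncore counter. */
struct reading {
        int socket;
        unsigned long long value;
};

/* Sums every child's reading, as described for "perf stat" above. */
static unsigned long long naive_sum(const struct reading *r, size_t n)
{
        unsigned long long total = 0;
        size_t i;

        for (i = 0; i < n; i++)
                total += r[i].value;
        return total;
}

/* Illustration only: count each socket once, using its first reading. */
static unsigned long long per_socket_sum(const struct reading *r, size_t n)
{
        bool seen[MAX_SOCKETS] = { false };
        unsigned long long total = 0;
        size_t i;

        for (i = 0; i < n; i++) {
                assert(r[i].socket >= 0 && r[i].socket < MAX_SOCKETS);
                if (!seen[r[i].socket]) {
                        seen[r[i].socket] = true;
                        total += r[i].value;
                }
        }
        return total;
}
```

If four children on socket 0 each read 1000 from the shared counter, naive_sum() reports 4000 while per_socket_sum() reports 1000. In practice the four readings would be taken at different times and would differ, so which value to report per socket is still an open question.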

Any comments are much appreciated.

 arch/x86/include/asm/msr-index.h              |    1 +
 arch/x86/kernel/cpu/perf_event.c              |   30 ++-
 arch/x86/kernel/cpu/perf_event_intel.c        |    4 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |  280 +++++++++++++++++++++++++
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |   80 +++++++
 arch/x86/kernel/cpu/perf_event_p4.c           |    2 +-
 include/linux/perf_event.h                    |    1 +
 tools/perf/util/parse-events.c                |   14 +-
 8 files changed, 394 insertions(+), 18 deletions(-)

Thanks,
Lin Ming
