[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lsq.1479082460.992673055@decadent.org.uk>
Date: Mon, 14 Nov 2016 00:14:20 +0000
From: Ben Hutchings <ben@...adent.org.uk>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC: akpm@...ux-foundation.org, "Thomas Gleixner" <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...nel.org>, "Borislav Petkov" <bp@...en8.de>,
"Matt Fleming" <matt@...eblueprint.co.uk>,
"Peter Zijlstra" <peterz@...radead.org>,
"Linus Torvalds" <torvalds@...ux-foundation.org>
Subject: [PATCH 3.16 297/346] perf/x86/amd: Make HW_CACHE_REFERENCES and
HW_CACHE_MISSES measure L2
3.16.39-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Matt Fleming <matt@...eblueprint.co.uk>
commit 080fe0b790ad438fc1b61621dac37c1964ce7f35 upstream.
While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.
This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,
$ perf stat -e cache-references,cache-misses
measure completely different things.
Instead, make the AMD PMU measure instruction/data cache and TLB fill
requests to the L2 and instruction/data cache and TLB misses in the L2
when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled,
respectively. That way the events measure unified caches on both
platforms.
Signed-off-by: Matt Fleming <matt@...eblueprint.co.uk>
Acked-by: Peter Zijlstra <peterz@...radead.org>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@...nel.org>
[bwh: Backported to 3.16:
- Drop KVM PMU changes
- Adjust filename]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -119,8 +119,8 @@ static const u64 amd_perfmon_event_map[]
{
[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
- [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
- [PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x077e,
[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */
Powered by blists - more mailing lists