lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1470928902-31196-1-git-send-email-matt@codeblueprint.co.uk>
Date:	Thu, 11 Aug 2016 16:21:42 +0100
From:	Matt Fleming <matt@...eblueprint.co.uk>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org,
	Matt Fleming <matt@...eblueprint.co.uk>,
	Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
	stable@...r.kernel.org
Subject: [PATCH] perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2

While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.

This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,

  $ perf stat -e cache-references,cache-misses

measure completely different things.

Instead, make the AMD PMU measure instruction/data cache fill requests
to the L2 and instruction/data cache misses in the L2 when
HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled, respectively.
That way the events measure unified caches on both platforms.

Signed-off-by: Matt Fleming <matt@...eblueprint.co.uk>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>
Cc: <stable@...r.kernel.org>
---
 arch/x86/events/amd/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index e07a22bb9308..8fd8bf79f32b 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -119,8 +119,8 @@ static const u64 amd_perfmon_event_map[PERF_COUNT_HW_MAX] =
 {
   [PERF_COUNT_HW_CPU_CYCLES]			= 0x0076,
   [PERF_COUNT_HW_INSTRUCTIONS]			= 0x00c0,
-  [PERF_COUNT_HW_CACHE_REFERENCES]		= 0x0080,
-  [PERF_COUNT_HW_CACHE_MISSES]			= 0x0081,
+  [PERF_COUNT_HW_CACHE_REFERENCES]		= 0x037d,
+  [PERF_COUNT_HW_CACHE_MISSES]			= 0x037e,
   [PERF_COUNT_HW_BRANCH_INSTRUCTIONS]		= 0x00c2,
   [PERF_COUNT_HW_BRANCH_MISSES]			= 0x00c3,
   [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND]	= 0x00d0, /* "Decoder empty" event */
-- 
2.7.3

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ