[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230616115316.3652155-1-leitao@debian.org>
Date: Fri, 16 Jun 2023 04:53:15 -0700
From: Breno Leitao <leitao@...ian.org>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>,
Sandipan Das <sandipan.das@....com>
Cc: leit@...com, dcostantino@...a.com,
linux-perf-users@...r.kernel.org (open list:PERFORMANCE EVENTS
SUBSYSTEM),
linux-kernel@...r.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM)
Subject: [PATCH] perf/x86/amd: Do not WARN on every IRQ
On some systems, the Performance Counter Global Status Register is
coming with reserved bits set, which causes the system to be unusable
if a simple `perf top` runs. The system hits the WARN() thousands times
while perf runs.
WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
This happens because the "Performance Counter Global Status Register"
(PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according
to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s
Manual, Volume 2: System Programming, 24593"[1]
WARN_ONCE if any reserved bit is set, and sanitize the value to what the
code is handling, so the overflow events continue to be handled for the
number of events that are known to be sane.
Signed-off-by: Breno Leitao <leitao@...ian.org>
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
[1] Link: https://www.amd.com/system/files/TechDocs/24593.pdf
---
arch/x86/events/amd/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index bccea57dee81..809ddb15c122 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -909,6 +909,10 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
status &= ~GLOBAL_STATUS_LBRS_FROZEN;
}
+ amd_pmu_global_cntr_mask = (1ULL << x86_pmu.num_counters) - 1;
+ WARN_ON_ONCE(status & ~amd_pmu_global_cntr_mask);
+ status &= amd_pmu_global_cntr_mask;
+
for (idx = 0; idx < x86_pmu.num_counters; idx++) {
if (!test_bit(idx, cpuc->active_mask))
continue;
--
2.34.1
Powered by blists - more mailing lists