Message-ID: <ZQ1mdoMBJd4PCvZa@gmail.com>
Date: Fri, 22 Sep 2023 12:03:34 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Sandipan Das <sandipan.das@....com>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
x86@...nel.org, peterz@...radead.org, leitao@...ian.org,
mingo@...hat.com, acme@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
namhyung@...nel.org, irogers@...gle.com, adrian.hunter@...el.com,
tglx@...utronix.de, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, leit@...com, dcostantino@...a.com,
jhladky@...hat.com, eranian@...gle.com, ananth.narayan@....com,
ravi.bangoria@....com, santosh.shukla@....com
Subject: Re: rom 3540f985652f41041e54ee82aa53e7dbd55739ae Mon Sep 17 00:00:00
2001
* Sandipan Das <sandipan.das@....com> wrote:
> Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI
> handler, as shown below, several times while perf runs. A simple
> `perf top` run is enough to render the system unusable.
>
> WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
>
> This happens because the Performance Counter Global Status Register
> (PerfCntGlobalStatus) has one or more bits set which are considered
> reserved according to the "AMD64 Architecture Programmer's Manual,
> Volume 2: System Programming, 24593". The document can be found at
> https://www.amd.com/system/files/TechDocs/24593.pdf
>
> To make this less intrusive, warn just once if any reserved bit is set
> and prompt the user to update the microcode. Also sanitize the value to
> what the code is handling, so that the overflow events continue to be
> handled for the number of counters that are known to be sane.
>
> Going forward, the following microcode patch levels are recommended
> for Zen 4 processors in order to avoid such issues with reserved bits.
>
> Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e
> Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e
> Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116
> Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212
>
> Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from
> the linux-firmware tree has binaries that meet the minimum required
> patch levels.
>
> Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
> Reported-by: Jirka Hladky <jhladky@...hat.com>
> Signed-off-by: Breno Leitao <leitao@...ian.org>
> [sandipan: add message to prompt users to update microcode]
> [sandipan: rework commit message and call out required microcode levels]
> Signed-off-by: Sandipan Das <sandipan.das@....com>
> v2:
> - Use pr_warn_once() instead of WARN_ON_ONCE() to prompt users to
> update microcode
> - Rework commit message and add details of minimum required microcode
> patch levels.
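For readers following the thread, the "warn once and sanitize" approach described in the quoted commit message can be sketched in user-space C. This is only an illustration, not the patch itself: the names known_counter_mask(), sanitize_status() and warn_count are hypothetical, fprintf() stands in for the kernel's pr_warn_once(), and the assumption that the low bits of the status value are the only ones the handler understands is made purely for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: mask covering the overflow bits of the first
 * num_counters counters (assumed to be the low bits of the status value). */
static uint64_t known_counter_mask(unsigned int num_counters)
{
	return num_counters >= 64 ? ~0ULL : (1ULL << num_counters) - 1;
}

/* Counts how many warnings were printed; stays at 0 or 1 by construction. */
static int warn_count;

/* Drop any bits outside 'mask' and complain only the first time such bits
 * are seen, so repeated interrupts do not flood the log. */
static uint64_t sanitize_status(uint64_t status, uint64_t mask)
{
	uint64_t reserved = status & ~mask;

	if (reserved && warn_count == 0) {
		warn_count++;
		fprintf(stderr,
			"reserved status bits set (0x%llx), please consider updating microcode\n",
			(unsigned long long)reserved);
	}

	/* Keep only the bits the handler knows how to process. */
	return status & mask;
}
```

With a mask for six counters, a status of 0x8000000000000021 would be reduced to 0x21 and trigger the single warning; later calls with stray bits are sanitized silently.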
1)
I don't think you ever re-sent this patch with the correct subject line.
( Or at least it's not in my mbox. )
2)
So if the fix is from Breno Leitao originally, then there should be a:
  From: Breno Leitao <leitao@...ian.org>
at the beginning of the patch to make authorship clear.
You might also want to add:
  Co-developed-by: Sandipan Das <sandipan.das@....com>
to make your contributions clear.
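As a rough sketch of the resulting layout, assuming the usual convention from Documentation/process/submitting-patches.rst that a Co-developed-by: tag is immediately followed by that co-author's Signed-off-by: and that the submitter's Signed-off-by: comes last, the patch body might look like this (the changelog line is a placeholder and the bracketed notes are omitted for brevity):

```
From: Breno Leitao <leitao@...ian.org>

<changelog text>

Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Reported-by: Jirka Hladky <jhladky@...hat.com>
Signed-off-by: Breno Leitao <leitao@...ian.org>
Co-developed-by: Sandipan Das <sandipan.das@....com>
Signed-off-by: Sandipan Das <sandipan.das@....com>
```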
Thanks,
Ingo