Message-ID: <YFHWNWBAQ4rsyAMG@hirez.programming.kicks-ass.net>
Date: Wed, 17 Mar 2021 11:13:09 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Kim Phillips <kim.phillips@....com>, Jiri Olsa <jolsa@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Tom Lendacky <thomas.lendacky@....com>, x86@...nel.org,
lkml <linux-kernel@...r.kernel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Stanislav Kozina <skozina@...hat.com>,
Michael Petlan <mpetlan@...hat.com>,
Pierre Amadio <pamadio@...hat.com>, onatalen@...hat.com,
darcari@...hat.com, "Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: unknown NMI on AMD Rome

On Wed, Mar 17, 2021 at 09:48:29AM +0100, Ingo Molnar wrote:
> > https://developer.amd.com/wp-content/resources/56323-PUB_0.78.pdf
>
> So:
>
>
> 1215 IBS (Instruction Based Sampling) Counter Valid Value
> May be Incorrect After Exit From Core C6 (CC6) State
>
> Description
>
> If a core's IBS feature is enabled and configured to generate an interrupt, including NMI (Non-Maskable
> Interrupt), and the IBS counter overflows during the entry into the Core C6 (CC6) state, the interrupt may be
> issued, but an invalid value of the valid bit may be restored when the core exits CC6.
>
> Potential Effect on System
>
> The operating system may receive interrupts due to an IBS counter event, including NMI, and not observe a
> valid IBS register. Console messages indicating "NMI received for unknown reason" have been observed on
> Linux systems.
>
> Suggested Workaround: None
> Fix Planned: No fix planned

Should be simple enough to disable CC6 while IBS is in use. Kim, can you
please make that happen?
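
For illustration only, here is a rough userspace sketch of the same idea:
keep cores out of deep idle states for the duration of an IBS session. It
is not the kernel change asked for above. It assumes the standard PM QoS
device /dev/cpu_dma_latency, which takes a 32-bit latency limit in
microseconds and honours it for as long as the file descriptor stays open.
The helper names are hypothetical, and whether a given limit actually
excludes CC6 depends on the exit latencies the platform advertises.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from this thread): ask PM QoS to keep CPU
 * wakeup latency at or below max_latency_us by writing a 32-bit value to
 * the device at 'path' (normally /dev/cpu_dma_latency). The request stays
 * active only while the returned file descriptor is open.
 * Returns the fd on success, -1 on error.
 */
static int hold_cpu_latency_limit(const char *path, int32_t max_latency_us)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
	    (ssize_t)sizeof(max_latency_us)) {
		close(fd);
		return -1;
	}

	return fd;
}

/* Drop the request; the idle governor may then pick deep states again. */
static void release_cpu_latency_limit(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

A profiling wrapper would call
hold_cpu_latency_limit("/dev/cpu_dma_latency", 0) before enabling IBS, keep
the fd open while sampling, and call release_cpu_latency_limit() afterwards.
A limit of 0 is the bluntest choice: it rules out every idle state with a
non-zero exit latency, system-wide, and costs power accordingly.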