Message-ID: <alpine.LNX.2.20.13.2103161945380.17743@monopod.intra.ispras.ru>
Date: Tue, 16 Mar 2021 19:48:19 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: Adam Borowski <kilobyte@...band.pl>
cc: Jiri Olsa <jolsa@...hat.com>, Borislav Petkov <bp@...en8.de>,
Tom Lendacky <thomas.lendacky@....com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>, x86@...nel.org,
lkml <linux-kernel@...r.kernel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Stanislav Kozina <skozina@...hat.com>,
Michael Petlan <mpetlan@...hat.com>,
Pierre Amadio <pamadio@...hat.com>, onatalen@...hat.com,
darcari@...hat.com
Subject: Re: unknown NMI on AMD Rome
On Tue, 16 Mar 2021, Adam Borowski wrote:
> On Tue, Mar 16, 2021 at 04:45:02PM +0100, Jiri Olsa wrote:
> > hi,
> > when running 'perf top' on AMD Rome (/proc/cpuinfo below)
> > with fedora 33 kernel 5.10.22-200.fc33.x86_64
> >
> > we got unknown NMI messages:
> >
> > [ 226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90.
> > [ 226.700162] Do you have a strange power saving mode enabled?
> > [ 226.700163] Dazed and confused, but trying to continue
> >
> > also when discussing this with Borislav, he managed to reproduce easily
> > on his AMD Rome machine
>
> Likewise, 3c on Pinnacle Ridge.
I've also seen it on Renoir, and it appears related to PMU interrupt racing
against C-state entry/exit. Disabling C2 and C3 via 'cpupower' is enough to
avoid those NMIs in my case.
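
For anyone who wants to try the same workaround without cpupower, here is a
minimal Python sketch that toggles idle states through the kernel's cpuidle
sysfs files (cpuN/cpuidle/stateM/name and .../disable). It is not the exact
cpupower invocation used above. The helper name set_cstates_disabled and the
state names "C2"/"C3" are assumptions for illustration; the names a machine
exposes depend on the platform and idle driver, so check the 'name' files
first. Writing to the real sysfs tree needs root.

```python
import glob
import os

SYSFS_CPU = "/sys/devices/system/cpu"  # standard Linux sysfs location


def set_cstates_disabled(names, disabled=True, root=SYSFS_CPU):
    """Disable (or re-enable) every idle state whose sysfs 'name' is in `names`.

    `names` holds state names as reported by cpuidle/stateM/name; "C2" and
    "C3" are assumed examples, not guaranteed to exist on a given machine.
    Returns the list of 'disable' files that were written.
    """
    wanted = set(names)
    written = []
    # Matches cpu0, cpu1, ... but not sibling directories such as cpuidle/ or cpufreq/.
    pattern = os.path.join(root, "cpu[0-9]*", "cpuidle", "state[0-9]*")
    for state_dir in sorted(glob.glob(pattern)):
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        if name not in wanted:
            continue
        path = os.path.join(state_dir, "disable")
        with open(path, "w") as f:
            # The kernel treats "1" as disabled and "0" as enabled.
            f.write("1" if disabled else "0")
        written.append(path)
    return written
```

Calling set_cstates_disabled({"C2", "C3"}) as root should take those states
out of use on all CPUs until reboot, and passing disabled=False puts them back.
The root parameter exists only so the function can be pointed at a scratch
directory for testing.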
IIRC there were a few patches related to this area from AMD in the past.
Alexander