Message-ID: <alpine.LNX.2.20.13.2103161945380.17743@monopod.intra.ispras.ru>
Date:   Tue, 16 Mar 2021 19:48:19 +0300 (MSK)
From:   Alexander Monakov <amonakov@...ras.ru>
To:     Adam Borowski <kilobyte@...band.pl>
cc:     Jiri Olsa <jolsa@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Tom Lendacky <thomas.lendacky@....com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>, x86@...nel.org,
        lkml <linux-kernel@...r.kernel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Stanislav Kozina <skozina@...hat.com>,
        Michael Petlan <mpetlan@...hat.com>,
        Pierre Amadio <pamadio@...hat.com>, onatalen@...hat.com,
        darcari@...hat.com
Subject: Re: unknown NMI on AMD Rome

On Tue, 16 Mar 2021, Adam Borowski wrote:

> On Tue, Mar 16, 2021 at 04:45:02PM +0100, Jiri Olsa wrote:
> > hi,
> > when running 'perf top' on AMD Rome (/proc/cpuinfo below)
> > with fedora 33 kernel 5.10.22-200.fc33.x86_64
> > 
> > we got unknown NMI messages:
> > 
> > [  226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90.
> > [  226.700162] Do you have a strange power saving mode enabled?
> > [  226.700163] Dazed and confused, but trying to continue
> > 
> > also when discussing this with Borislav, he managed to reproduce it easily
> > on his AMD Rome machine
> 
> Likewise, 3c on Pinnacle Ridge.

I've also seen it on Renoir, and it appears related to PMU interrupt racing
against C-state entry/exit. Disabling C2 and C3 via 'cpupower' is enough to
avoid those NMIs in my case.
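
For anyone who wants to try the same workaround without looking up state indices for 'cpupower idle-set -d', here is a minimal sketch that flips the per-state 'disable' file under the kernel's cpuidle sysfs directory. The function name set_idle_states and its parameters are hypothetical, and the state names "C2" and "C3" are an assumption about what the idle driver reports; check the 'name' files on your machine first. Writing to the real sysfs tree needs root, and the setting does not survive a reboot.

```python
import glob
import os


def set_idle_states(names, disable=True, root="/sys/devices/system/cpu"):
    """Disable (or re-enable) every cpuidle state whose name is in `names`.

    Walks <root>/cpuN/cpuidle/stateM on all CPUs and writes "1" (disable)
    or "0" (enable) to the state's 'disable' file. Returns the list of
    files that were written.
    """
    written = []
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    pattern = os.path.join(root, "cpu[0-9]*", "cpuidle", "state[0-9]*")
    for state_dir in sorted(glob.glob(pattern)):
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        if name in names:
            path = os.path.join(state_dir, "disable")
            with open(path, "w") as f:
                f.write("1" if disable else "0")
            written.append(path)
    return written


# Example (as root; state names are an assumption, see above):
#   set_idle_states({"C2", "C3"})                  # turn C2/C3 off
#   set_idle_states({"C2", "C3"}, disable=False)   # turn them back on
```

Matching on the state name rather than the index is a deliberate choice here, since the index of a given C-state can differ between machines and idle drivers.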

IIRC there were a few patches related to this area from AMD in the past.

Alexander
