Message-ID: <alpine.LNX.2.20.13.2103171619320.17743@monopod.intra.ispras.ru>
Date:   Wed, 17 Mar 2021 16:32:17 +0300 (MSK)
From:   Alexander Monakov <amonakov@...ras.ru>
To:     Peter Zijlstra <peterz@...radead.org>
cc:     Ingo Molnar <mingo@...nel.org>,
        Kim Phillips <kim.phillips@....com>,
        Jiri Olsa <jolsa@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Tom Lendacky <thomas.lendacky@....com>, x86@...nel.org,
        lkml <linux-kernel@...r.kernel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Stanislav Kozina <skozina@...hat.com>,
        Michael Petlan <mpetlan@...hat.com>,
        Pierre Amadio <pamadio@...hat.com>, onatalen@...hat.com,
        darcari@...hat.com, "Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: unknown NMI on AMD Rome

On Wed, 17 Mar 2021, Peter Zijlstra wrote:

> On Wed, Mar 17, 2021 at 09:48:29AM +0100, Ingo Molnar wrote:
> > > https://developer.amd.com/wp-content/resources/56323-PUB_0.78.pdf
> > 
> > So:
> > 
> > 
> >   1215 IBS (Instruction Based Sampling) Counter Valid Value
> >   May be Incorrect After Exit From Core C6 (CC6) State
> > 
> >   Description
> > 
> >   If a core's IBS feature is enabled and configured to generate an interrupt, including NMI (Non-Maskable
> >   Interrupt), and the IBS counter overflows during the entry into the Core C6 (CC6) state, the interrupt may be
> >   issued, but an invalid value of the valid bit may be restored when the core exits CC6.
> > 
> >   Potential Effect on System
> > 
> >   The operating system may receive interrupts due to an IBS counter event, including NMI, and not observe an
> >   valid IBS register. Console messages indicating "NMI received for unknown reason" have been observed on
> >   Linux systems.
> > 
> >   Suggested Workaround: None
> >   Fix Planned: No fix planned
> 
> Should be simple enough to disable CC6 while IBS is in use. Kim, can you
> please make that happen?

Wouldn't that "magically" significantly speed up workloads running under
'perf top', in case they don't saturate the CPUs? Scheduling gets
much snappier if the target CPU doesn't need to wake up from deep sleep :)
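
For anyone who wants to try that effect without any kernel change, here is
a rough userspace sketch using the PM QoS character device
/dev/cpu_dma_latency. This is a different mechanism from the in-kernel
CC6 disable proposed above. hold_cpu_latency() and its path parameter are
made up for illustration (the path is a parameter only so the function can
be exercised without root), and whether a given latency bound actually
keeps cores out of CC6 depends on the platform's idle state table, so
treat that part as an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Ask the kernel's PM QoS layer to keep CPU wakeup latency at or below
 * max_latency_us microseconds. The request stays in force only while the
 * returned descriptor is open; closing it drops the request.
 *
 * path is normally "/dev/cpu_dma_latency" (needs root).
 * Returns an open descriptor on success, -1 on failure.
 */
static int hold_cpu_latency(const char *path, int32_t max_latency_us)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The device accepts the bound as a raw 32-bit integer. */
    ssize_t n = write(fd, &max_latency_us, sizeof(max_latency_us));
    if (n != (ssize_t)sizeof(max_latency_us)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling hold_cpu_latency("/dev/cpu_dma_latency", 0) before starting the
workload and close() on the result afterwards should be enough to compare
timings with and without deep idle states.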

Alternatively, would you consider adding the erratum reference to the
printk message when IBS is in use, and rate-limit it so it doesn't
flood dmesg? Then the user will know what's going on, and may
choose to temporarily disable C-states using the 'cpupower' tool.
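
To make the second suggestion concrete, below is a minimal userspace
sketch of a rate-limited warning that carries the erratum reference. It is
not kernel code and not a proposed patch: struct ratelimit,
ratelimit_init(), ratelimit_allow() and warn_unknown_nmi() are
hypothetical names, and the exact wording of the hint is an assumption. In
the kernel the existing helpers such as printk_ratelimited() would be the
natural fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Allow at most `burst` messages per `interval_sec` window. */
struct ratelimit {
    time_t window_start;  /* start of the current window */
    int interval_sec;     /* window length in seconds */
    int burst;            /* messages allowed per window */
    int printed;          /* messages emitted in this window */
    int suppressed;       /* messages dropped in this window */
};

static void ratelimit_init(struct ratelimit *rl, int interval_sec, int burst)
{
    assert(interval_sec > 0 && burst > 0);
    rl->window_start = 0;
    rl->interval_sec = interval_sec;
    rl->burst = burst;
    rl->printed = 0;
    rl->suppressed = 0;
}

/* Returns true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, time_t now)
{
    if (now - rl->window_start >= rl->interval_sec) {
        /* A new window begins: reset the counters. */
        rl->window_start = now;
        rl->printed = 0;
        rl->suppressed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->suppressed++;
    return false;
}

/*
 * Print the unknown-NMI warning, adding a pointer to the erratum when IBS
 * is in use. Returns true if the message was printed, false if it was
 * suppressed by the rate limit.
 */
static bool warn_unknown_nmi(struct ratelimit *rl, time_t now,
                             bool ibs_in_use, FILE *out)
{
    if (!ratelimit_allow(rl, now))
        return false;
    fprintf(out, "NMI received for unknown reason%s\n",
            ibs_in_use ? " (IBS in use: possibly AMD erratum 1215,"
                         " see revision guide 56323)" : "");
    return true;
}
```

With ratelimit_init(&rl, 10, 2), two warnings get through in any
10-second window and the rest are only counted in rl.suppressed.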

Alexander
