linux-kernel - Re: [PATCH] perf/x86: read the FREEZE_WHILE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180612165635.GT12198@hirez.programming.kicks-ass.net>
Date:   Tue, 12 Jun 2018 18:56:35 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Arcari <darcari@...hat.com>
Cc:     linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Donald Zickus <dzickus@...hat.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Jerry Hoemann <jerry.hoemann@....com>
Subject: Re: [PATCH] perf/x86: read the FREEZE_WHILE_SMM bit during boot

On Mon, Jun 04, 2018 at 10:12:20AM -0400, David Arcari wrote:
> default_do_nmi() will process both perf events (local interrupts) as well as
> external interrupts (such as the NMI button).  The handler is coded such that
> if a local interrupt occurs, no check is made for an external interrupt.
> Therefore, if the two interrupts occur simultaneously, the external interrupt
> is lost.

ACK, NMI handling on x86 is less than ideal.

> The code above, which was ultimately discounted, attempts to avoid this
> scenario with as little performance impact as possible by reading the register
> without the spinlock for cpu 0 only (currently only cpu 0 can handle an
> external NMI, I verified this on my system by testing the NMI button with
> cpu 0 offline).
> 
> The code above is problematic for a number of reasons not the least of which
> is performance.  Furthermore, I don't see a less intrusive solution wrt
> do_default_nmi().

Right, because reading the register itself is dog slow IIRC.

> Upstream 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")
> appears to have made it relatively easy to hit this race condition.  On some
> systems, this commit has resulted in a change to the default firmware setting
> of DEBUGCTLMSR_FREEZE_IN_SMM_BIT (it is now cleared by the OS by default).
> 
> With this bit cleared, the following situation occurs:
> 
> 1) external NMI - due to io check
> 2) long duration SMI (counters do not freeze)
> 3) NMI handler runs and misattributes interrupt to perf event

I think 3 is wrong, because 2 will in fact have triggered a PMI (due to
long running) so 3 will observe a PMI and claim the NMI. No
misattribution what so ever.

Because if this wasn't the case, flipping FREEZE_IN_SMM wouldn't have
made a difference.

> Ultimately, my solution was to restore the previous behavior by reading and
> storing the firmware setting of the bit rather than to always clear it.

Ah, urgh.. what a mess. So the OS setting the bit to a known and
consistent value is 'good' IMO. The firmware magically frobbing things
is 'bad'.

Now, explain to me why an IO-check results in an external NMI, and why
there are long running SMI handlers around? Why can't the IO error not
be propagated through the regular device interrupt/state? Why are long
running SMIs required at all, ever? Why doesn't the OS handler whatever
it is the SMM does?

Are you not solving the wrong problem here?