lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180604082414.GO12217@hirez.programming.kicks-ass.net>
Date:   Mon, 4 Jun 2018 10:24:14 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Arcari <darcari@...hat.com>
Cc:     linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Donald Zickus <dzickus@...hat.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Jerry Hoemann <jerry.hoemann@....com>
Subject: Re: [PATCH] perf/x86: read the FREEZE_WHILE_SMM bit during boot

On Sun, Jun 03, 2018 at 02:23:43PM -0400, David Arcari wrote:
> On some systems pressing the external NMI button is now failing to inject
> an NMI 5-10% of the time.  This causes confusion for a user that expects
> the NMI to dump the system.
> 
> Commit 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")
> does not read the firmware setting of the FREEZE_WHILE_SMM bit and will
> always clear it when the PMU is initialized.  As a result the performance
> counters will always run and that greatly expands the race in which
> external NMI will not be processed if a local NMI is already being
> processed.
> 
> One option is to change default_do_nmi().  The code snippet below shows the
> relevant portion of a patch that resolves the issue, but it is problematic
> from a performance perspective and was dismissed.
> 
> -345,7 +345,17 @@ static void default_do_nmi(struct pt_regs *regs)
>  		 */
>  		if (handled > 1)
>  			__this_cpu_write(swallow_nmi, true);
> -		return;
> +
> +		/*
> +		 * Unfortunately, there is a race condition which can
> +		 * result in a missing an external NMI.  Typically, an
> +		 * external NMI is processed on cpu 0.  Therefore, on
> +		 * cpu 0 check for an external NMI before returning.
> +		 */
> +		if (smp_processor_id() ||
> +		    (x86_platform.get_nmi_reason() & NMI_REASON_MASK) == 0) {
> +			return;
> +		}
>  	}
> 
> Ultimately, the issue can be resolved by storing the default firmware
> setting of FREEZE_WHILE_SMM before initializing the PMU.

I'm sorry, I know it's Monday morning, but what?! I really don't
understand anything you write there.

Maybe if you explain the race and how your proposed fix closes it things
will make sense. The above refers to too many things not here.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ