Date:	Wed, 9 Jul 2014 14:51:52 -0700
From:	Havard Skinnemoen <hskinnemoen@...gle.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Tony Luck <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Ewout van Bekkum <ewout@...gle.com>
Subject: Re: [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and
 CMCI reports.

On Wed, Jul 9, 2014 at 1:35 PM, Andi Kleen <andi@...stfloor.org> wrote:
> Havard Skinnemoen <hskinnemoen@...gle.com> writes:
>
>> machine_check_poll() was modified to use spin_lock_irqsave independently
>> per bank when a valid MCE is found to prevent duplicated MCE reports by
>> the CMCI and polling methods. In the common case no MCE will be found,
>> so the lock is not acquired until a valid MCE is found. The status is
>> reread after the lock is acquired in case the MCE was already handled by
>> a different thread. A unique spinlock is used per bank number, so
>> contention should be mostly limited to non-shared banks.
>
> This doesn't make sense. Banks are either owned by CMCI or by poll,
> not by both. If you have true duplicates the bug must be somewhere else.

I don't think we got the description right here. I think the real
issue was machine check polls happening on multiple CPUs with
shared banks, all reporting the same MCEs. This is very reproducible
when booting with mce=no_cmci, since all CPUs will handle all banks,
and there's AFAICT no good way to identify shared banks without
enabling CMCI.

There may have been an interaction with CMCI here too at some point,
but it's possible that went away with the timer patch (which we did a
bit later).
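
For illustration only, the pattern from the patch description (check the
bank status without a lock, take a per-bank lock only when a valid error
is seen, then re-read the status under the lock) can be sketched in
userspace C. This is not the patch itself: the status array, bank and
CPU counts, pthread mutexes standing in for spin_lock_irqsave, and all
function names below are assumptions made up for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_BANKS 4                 /* hypothetical number of shared banks */
#define NUM_CPUS  8                 /* hypothetical number of polling CPUs */
#define STATUS_VALID (1ULL << 63)   /* stand-in for MCI_STATUS_VAL */

/* Simulated shared bank status registers (stand-in for MSR reads). */
static _Atomic uint64_t bank_status[NUM_BANKS];
static pthread_mutex_t bank_lock[NUM_BANKS];   /* one lock per bank */
static atomic_int reports[NUM_BANKS];          /* reports logged per bank */

/* Every "CPU" polls every bank, as happens with mce=no_cmci. */
static void poll_banks(void)
{
    for (int i = 0; i < NUM_BANKS; i++) {
        /* Common case: nothing logged, so no lock is taken. */
        if (!(atomic_load(&bank_status[i]) & STATUS_VALID))
            continue;

        pthread_mutex_lock(&bank_lock[i]);
        /* Re-read under the lock: another poller may have handled it. */
        uint64_t status = atomic_load(&bank_status[i]);
        if (status & STATUS_VALID) {
            atomic_fetch_add(&reports[i], 1);   /* "log" the error once */
            atomic_store(&bank_status[i], 0);   /* clear the bank */
        }
        pthread_mutex_unlock(&bank_lock[i]);
    }
}

static void *cpu_thread(void *arg)
{
    (void)arg;
    poll_banks();
    return NULL;
}

/* Inject one error per bank, run all pollers, return the total reports. */
static int run_simulation(void)
{
    pthread_t threads[NUM_CPUS];
    int total = 0;

    for (int i = 0; i < NUM_BANKS; i++) {
        pthread_mutex_init(&bank_lock[i], NULL);
        atomic_store(&reports[i], 0);
        atomic_store(&bank_status[i], STATUS_VALID | (uint64_t)i);
    }
    for (int i = 0; i < NUM_CPUS; i++) {
        int rc = pthread_create(&threads[i], NULL, cpu_thread, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_CPUS; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < NUM_BANKS; i++) {
        total += atomic_load(&reports[i]);
        pthread_mutex_destroy(&bank_lock[i]);
    }
    return total;
}
```

Calling run_simulation() from a main() (built with -pthread) should
return NUM_BANKS: each injected error is reported once, however many
pollers see it. Dropping the re-read under the lock would allow several
pollers to report the same bank.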

Havard