lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 4 Dec 2014 18:29:35 -0800
From:	Calvin Owens <calvinowens@...com>
To:	Tony Luck <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>
CC:	<x86@...nel.org>, <linux-edac@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <kernel-team@...com>,
	Calvin Owens <calvinowens@...com>
Subject: [PATCH] x86: mce: Avoid timer double-add during CMCI interrupt storms.

The Intel CMCI interrupt handler calls mce_timer_kick() to force more
frequent polling for MCE events when a CMCI storm occurs and CMCI
interrupts are subsequently disabled.

If a CMCI interrupt storm happens to be detected while the timer
interrupt is executing timer functions, mce_timer_kick() can race with
mce_timer_fn(), which results in a double-add and the following BUG:

	#0 [ffff88047fda3ad0] machine_kexec at ffffffff8102bdf5
	#1 [ffff88047fda3b20] crash_kexec at ffffffff8109e788
	#2 [ffff88047fda3bf0] oops_end at ffffffff815f20e8
	#3 [ffff88047fda3c20] die at ffffffff81005c08
	#4 [ffff88047fda3c50] do_trap at ffffffff815f192b
	#5 [ffff88047fda3cb0] do_invalid_op at ffffffff81002f42
	#6 [ffff88047fda3d60] invalid_op at ffffffff815fa668
	[exception RIP: add_timer_on+234]
	RIP: ffffffff8104d05a RSP: ffff88047fda3e18 RFLAGS: 00010286
	RAX: 0000000000000000 RBX: ffff88047fdacbc0 RCX: 000000001fbee3ff
	RDX: ffff88047fda0000 RSI: 000000000000001d RDI: ffff88047fdacbc0
	RBP: ffff88047fda3e58 R8: 0000000000000000 R9: ffffffff81aa0940
	R10: 0720072007200720 R11: 0720072007200765 R12: ffff880474a6c000
	R13: 0000000000000101 R14: 000000000000001d R15: ffff88047fdacbc0
	ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
	#7 [ffff88047fda3e60] mce_timer_fn at ffffffff8101f524
	#8 [ffff88047fda3e80] call_timer_fn at ffffffff8104b4fa
	#9 [ffff88047fda3ec0] run_timer_softirq at ffffffff8104ce70

The timer_add() in mce_timer_kick() is actually unnecessary: since the
timer is re-added by its handler function, the only case in which the
timer doesn't exist is when the CMCI interrupt calls mce_timer_kick() in
the interval between the timer firing and mce_timer_fn() actually being
executed. Thus, the timer work will be performed by mce_timer_fn() just
after the interrupt exits.

This patch removes the add_timer() from mce_timer_kick(), and disables
local interrupts during mce_timer_fn() so that mce_timer_fn() will
always pick up the timer interval value that mce_timer_kick() drops
in the PERCPU variable.

This means that the CMCI interrupt that hits the storm threshold will
call mce_timer_kick() either:

	1) In the interval between the mce_timer firing and mce_timer_fn()
	   disabling local IRQs. In this case, mce_timer_fn() will
	   immediately execute after the CMCI handler exits, and will
	   use the interval loaded in the PERCPU variable from
	   mce_timer_kick() to calculate its next timer interval.

	2) Happen after mce_timer_fn() has done its work, in which case
	   the existing timer will be updated with the new interval if
	   it is before the existing one.

Signed-off-by: Calvin Owens <calvinowens@...com>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 61a9668ce..7074a90 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1286,7 +1286,7 @@ static int cmc_error_seen(void)
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = this_cpu_ptr(&mce_timer);
-	unsigned long iv;
+	unsigned long iv, flags;
 	int notify;
 
 	WARN_ON(smp_processor_id() != data);
@@ -1301,6 +1301,9 @@ static void mce_timer_fn(unsigned long data)
 	 * Alert userspace if needed.  If we logged an MCE, reduce the
 	 * polling interval, otherwise increase the polling interval.
 	 */
+
+	local_irq_save(flags);
+
 	iv = __this_cpu_read(mce_next_interval);
 	notify = mce_notify_irq();
 	notify |= cmc_error_seen();
@@ -1316,6 +1319,8 @@ static void mce_timer_fn(unsigned long data)
 		t->expires = jiffies + iv;
 		add_timer_on(t, smp_processor_id());
 	}
+
+	local_irq_restore(flags);
 }
 
 /*
@@ -1330,9 +1335,6 @@ void mce_timer_kick(unsigned long interval)
 	if (timer_pending(t)) {
 		if (time_before(when, t->expires))
 			mod_timer_pinned(t, when);
-	} else {
-		t->expires = round_jiffies(when);
-		add_timer_on(t, smp_processor_id());
 	}
 	if (interval < iv)
 		__this_cpu_write(mce_next_interval, interval);
-- 
2.1.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ