[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1417746575-23299-1-git-send-email-calvinowens@fb.com>
Date: Thu, 4 Dec 2014 18:29:35 -0800
From: Calvin Owens <calvinowens@...com>
To: Tony Luck <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>
CC: <x86@...nel.org>, <linux-edac@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <kernel-team@...com>,
Calvin Owens <calvinowens@...com>
Subject: [PATCH] x86: mce: Avoid timer double-add during CMCI interrupt storms.
The Intel CMCI interrupt handler calls mce_timer_kick() to force more
frequent polling for MCE events when a CMCI storm occurs and CMCI
interrupts are subsequently disabled.
If a CMCI interrupt storm happens to be detected while the timer
interrupt is executing timer functions, mce_timer_kick() can race with
mce_timer_fn(), which results in a double-add and the following BUG:
#0 [ffff88047fda3ad0] machine_kexec at ffffffff8102bdf5
#1 [ffff88047fda3b20] crash_kexec at ffffffff8109e788
#2 [ffff88047fda3bf0] oops_end at ffffffff815f20e8
#3 [ffff88047fda3c20] die at ffffffff81005c08
#4 [ffff88047fda3c50] do_trap at ffffffff815f192b
#5 [ffff88047fda3cb0] do_invalid_op at ffffffff81002f42
#6 [ffff88047fda3d60] invalid_op at ffffffff815fa668
[exception RIP: add_timer_on+234]
RIP: ffffffff8104d05a RSP: ffff88047fda3e18 RFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88047fdacbc0 RCX: 000000001fbee3ff
RDX: ffff88047fda0000 RSI: 000000000000001d RDI: ffff88047fdacbc0
RBP: ffff88047fda3e58 R8: 0000000000000000 R9: ffffffff81aa0940
R10: 0720072007200720 R11: 0720072007200765 R12: ffff880474a6c000
R13: 0000000000000101 R14: 000000000000001d R15: ffff88047fdacbc0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88047fda3e60] mce_timer_fn at ffffffff8101f524
#8 [ffff88047fda3e80] call_timer_fn at ffffffff8104b4fa
#9 [ffff88047fda3ec0] run_timer_softirq at ffffffff8104ce70
The timer_add() in mce_timer_kick() is actually unnecessary: since the
timer is re-added by its handler function, the only case in which the
timer doesn't exist is when the CMCI interrupt calls mce_timer_kick() in
the interval between the timer firing and mce_timer_fn() actually being
executed. Thus, the timer work will be performed by mce_timer_fn() just
after the interrupt exits.
This patch removes the add_timer() from mce_timer_kick(), and disables
local interrupts during mce_timer_fn() so that mce_timer_fn() will
always pick up the timer interval value that mce_timer_kick() drops
in the PERCPU variable.
This means that the CMCI interrupt that hits the storm threshold will
call mce_timer_kick() either:
1) In the interval between the mce_timer firing and mce_timer_fn()
disabling local IRQs. In this case, mce_timer_fn() will
immediately execute after the CMCI handler exits, and will
use the interval loaded in the PERCPU variable from
mce_timer_kick() to calculate its next timer interval.
2) Happen after mce_timer_fn() has done its work, in which case
the existing timer will be updated with the new interval if
it is before the existing one.
Signed-off-by: Calvin Owens <calvinowens@...com>
---
arch/x86/kernel/cpu/mcheck/mce.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 61a9668ce..7074a90 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1286,7 +1286,7 @@ static int cmc_error_seen(void)
static void mce_timer_fn(unsigned long data)
{
struct timer_list *t = this_cpu_ptr(&mce_timer);
- unsigned long iv;
+ unsigned long iv, flags;
int notify;
WARN_ON(smp_processor_id() != data);
@@ -1301,6 +1301,9 @@ static void mce_timer_fn(unsigned long data)
* Alert userspace if needed. If we logged an MCE, reduce the
* polling interval, otherwise increase the polling interval.
*/
+
+ local_irq_save(flags);
+
iv = __this_cpu_read(mce_next_interval);
notify = mce_notify_irq();
notify |= cmc_error_seen();
@@ -1316,6 +1319,8 @@ static void mce_timer_fn(unsigned long data)
t->expires = jiffies + iv;
add_timer_on(t, smp_processor_id());
}
+
+ local_irq_restore(flags);
}
/*
@@ -1330,9 +1335,6 @@ void mce_timer_kick(unsigned long interval)
if (timer_pending(t)) {
if (time_before(when, t->expires))
mod_timer_pinned(t, when);
- } else {
- t->expires = round_jiffies(when);
- add_timer_on(t, smp_processor_id());
}
if (interval < iv)
__this_cpu_write(mce_next_interval, interval);
--
2.1.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists