[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141217001711.GA7152@mail.thefacebook.com>
Date: Tue, 16 Dec 2014 16:17:11 -0800
From: Calvin Owens <calvinowens@...com>
To: Borislav Petkov <bp@...en8.de>
CC: "Luck, Tony" <tony.luck@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kernel-team@...com" <kernel-team@...com>
Subject: Re: [PATCH] x86: mce: Avoid timer double-add during CMCI interrupt
storms.
On Thursday 12/11 at 15:43 +0100, Borislav Petkov wrote:
> On Wed, Dec 10, 2014 at 06:34:38PM -0800, Calvin Owens wrote:
> > This doens't fix the original issue though: the timer double-add can
> > still happen if the CMCI interrupt that hits the STORM threshold pops
> > off during mce_timer_fn() and calls mce_timer_kick().
>
> Right, I guess we could do something similar to what you proposed
> already (diff below ontop of my previous diff).
That looks good, thanks :)
> > The polling is unnecessary on a CMCI-capable machine except in STORMs,
> > right? Wouldn't it be better to eliminate it in that case?
>
> Right, the problem with error thresholding is that it only counts errors
> and raises the APIC interrupt when a certain count has been reached.
> But there's no way for the software to look at *each* error and try to
> discover trends and stuff.
>
> Thus, we've been working on something which looks at each memory error,
> collects them and then decays them over time and acts only for the
> significant ones by poisoning pages:
>
> https://urldefense.proofpoint.com/v1/url?u=https://lkml.kernel.org/r/1404242623-10094-1-git-send-email-bp%40alien8.de&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=oEb120Cp%2FehdhuY2M7qjelK5yT8IPB5WC2nEG4obDh4%3D%0A&m=Uvvje79YqtSSsaolAYcJ9N3FPGiWTIz7feUxwbhAkNc%3D%0A&s=c88a2b8458e7f07ba887d26414f2ce5b80f9180353d0895a7f4f8b47d19dbec8
>
> When we have that thing in place we can probably even go a step further
> and simply disable error thresholding because it simply doesn't give us
> what we need.
>
> Btw, we would very much welcome your thoughs on this collector thing and
> whether it would be usable in large installations. Or if not, what else
> would it need added.
Ooh I like this, I'll add some thoughts over in that thread.
> > That said, I ran mce-test with your patch on top of 3.18, looks good
> > there. But that doesn't simulate CMCI interrupts.
>
> Thanks, I had a machine here which was the perfect natural error
> injector (has faulty DIMMs). I need to dig it out and play with it again
> :)
>
> Thanks.
>
> ---
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index e3f426698164..8cf2f486b81e 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1331,6 +1331,24 @@ int ce_error_seen(void)
> return test_and_clear_bit(0, v);
> }
>
> +static void __restart_timer(struct timer_list *t, unsigned long interval)
> +{
> + unsigned long when = jiffies + interval;
> + unsigned long flags;
> +
> + local_irq_save(flags);
> +
> + if (timer_pending(t)) {
> + if (time_before(when, t->expires))
> + mod_timer_pinned(t, when);
> + } else {
> + t->expires = round_jiffies(when);
> + add_timer_on(t, smp_processor_id());
> + }
> +
> + local_irq_restore(flags);
> +}
> +
> static void mce_timer_fn(unsigned long data)
> {
> struct timer_list *t = this_cpu_ptr(&mce_timer);
> @@ -1362,8 +1380,7 @@ static void mce_timer_fn(unsigned long data)
>
> done:
> __this_cpu_write(mce_next_interval, iv);
> - t->expires = jiffies + iv;
> - add_timer_on(t, cpu);
> + __restart_timer(t, iv);
> }
>
> /*
> @@ -1372,16 +1389,10 @@ done:
> void mce_timer_kick(unsigned long interval)
> {
> struct timer_list *t = this_cpu_ptr(&mce_timer);
> - unsigned long when = jiffies + interval;
> unsigned long iv = __this_cpu_read(mce_next_interval);
>
> - if (timer_pending(t)) {
> - if (time_before(when, t->expires))
> - mod_timer_pinned(t, when);
> - } else {
> - t->expires = round_jiffies(when);
> - add_timer_on(t, smp_processor_id());
> - }
> + __restart_timer(t, interval);
> +
> if (interval < iv)
> __this_cpu_write(mce_next_interval, interval);
> }
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists