Message-ID: <CAFQmdRb9vsWyF06jppS5U7Wzuc+SzRgWL+hs5+es-GC=5e_8qg@mail.gmail.com>
Date: Wed, 9 Jul 2014 16:00:04 -0700
From: Havard Skinnemoen <hskinnemoen@...gle.com>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Borislav Petkov <bp@...en8.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Ewout van Bekkum <ewout@...gle.com>
Subject: Re: [PATCH 5/6] x86-mce: check if no_way_out applies before deciding not to clear MCE banks.

On Wed, Jul 9, 2014 at 2:00 PM, Luck, Tony <tony.luck@...el.com> wrote:
> +       if (!(no_way_out && cfg->tolerant < 3))
>                 mce_clear_state(toclear);
>
> Style - I think this is easier to grok:
>
>         if (!no_way_out || cfg->tolerant >= 3)
>                 mce_clear_state(toclear);
>
> but I don't feel too strongly if others like the !(a && b) form.

I tend to agree with you. It came up during our internal review, and
others argued the other way. But since I'm in charge now, I'll change
it back ;-)
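
For anyone following along, the two spellings are the same condition by De Morgan's law. Here is a minimal standalone sketch that checks this exhaustively. The function names are made up for illustration, and I'm assuming tolerant only takes the values 0 through 3; none of this is the actual patch code.

```c
#include <assert.h>
#include <stdbool.h>

/* Form used in the patch as posted: !(a && b). */
static bool clear_form_patch(bool no_way_out, int tolerant)
{
	return !(no_way_out && tolerant < 3);
}

/* Form suggested in review: !a || !b, with !(tolerant < 3) written as >= 3. */
static bool clear_form_review(bool no_way_out, int tolerant)
{
	return !no_way_out || tolerant >= 3;
}

/* Compare both forms for every input (assumption: tolerant is 0..3). */
static void check_forms_agree(void)
{
	for (int nwo = 0; nwo <= 1; nwo++)
		for (int tolerant = 0; tolerant <= 3; tolerant++)
			assert(clear_form_patch(nwo, tolerant) ==
			       clear_form_review(nwo, tolerant));
}
```

Calling check_forms_agree() passes silently, so the choice between the two is purely a readability question.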
> I'm never sure how to treat the crazy levels of "tolerant" though. Do
> we really want to clear the banks? In one sense we do ... we are still
> running and might see more UC errors. Since newer UC errors don't
> overwrite older ones, clearing the banks allows us to see how many
> errors are piling up and being ignored.
>
> But running with tolerant==3 is likely to end in tears ... should we erase
> the evidence of what bad things happened?

It probably doesn't make a huge difference since you're not supposed
to run with tolerant=3, but I kind of understood the logic to be that
if we're going to keep running, we need to clear the banks, and if
we're going to crash, we need to leave them intact so whatever runs
next gets a chance to look at them. So with tolerant==3, we are going
to continue running, and I think for debugging purposes, it's useful
to see how many additional errors are happening.
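
To spell out that rule as I understand it, here is a rough sketch. The enum and helper are hypothetical names for illustration only, not kernel code, and the "will crash" test is my reading of the logic rather than something lifted from the source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the kernel source. */
enum bank_action {
	BANKS_CLEAR,	/* we keep running: reset banks so new errors show up */
	BANKS_KEEP,	/* we are going down: leave banks for whatever runs next */
};

static enum bank_action pick_bank_action(bool no_way_out, int tolerant)
{
	/* Assumption: we only crash when there is no way out and tolerant < 3. */
	bool will_crash = no_way_out && tolerant < 3;

	return will_crash ? BANKS_KEEP : BANKS_CLEAR;
}

/* Small self-check of the cases discussed in this thread. */
static void pick_bank_action_selftest(void)
{
	assert(pick_bank_action(true, 1) == BANKS_KEEP);
	assert(pick_bank_action(true, 3) == BANKS_CLEAR);
	assert(pick_bank_action(false, 0) == BANKS_CLEAR);
}
```

With tolerant==3 the helper always picks BANKS_CLEAR, which matches the idea that a machine that keeps running should keep logging fresh errors.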
Havard