Message-ID: <3908561D78D1C84285E8C5FCA982C28F192F2DD6@ORSMSX104.amr.corp.intel.com>
Date: Wed, 23 May 2012 17:01:54 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Chen Gong <gong.chen@...ux.intel.com>
CC: "bp@...64.org" <bp@...64.org>, "x86@...nel.org" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: RE: [PATCH] x86: auto poll/interrupt mode switch for CMC to stop CMC storm
> What's the point of doing this work? Why can't we just do that on the
> CPU which got hit by the MCE storm and leave the others alone? They
> either detect it themselves or are just not affected.
CMCI gets broadcast to all threads on a socket. So
if one cpu has a problem, many cpus have a problem :-(
Some machine check banks are local to a thread/core,
so we need to make sure that the CMCI gets taken by
someone who can actually see the bank with the problem.
The others are collateral damage - but this means there
is even more reason to do something about a CMCI storm
as the effects are not localized.
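To make the broadcast point concrete, here is a toy user-space model, not kernel code. The names `NR_CPUS`, `struct bank`, `cpu_can_see` and `broadcast_cmci` are made up for illustration, and the visibility rule (a bank is either owned by one cpu or shared by the whole socket) is a simplifying assumption. It only counts how many cpus take the interrupt without finding anything to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8	/* hypothetical: number of threads on one socket */

/*
 * Hypothetical model of a machine check bank.
 * owner >= 0: bank is local to that cpu; owner < 0: shared by the socket.
 */
struct bank {
	int owner;
	bool has_error;
};

static bool cpu_can_see(int cpu, const struct bank *b)
{
	return b->owner < 0 || b->owner == cpu;
}

/*
 * Simulate one CMCI being broadcast to every cpu on the socket.
 * The first cpu that can see the bank while it still holds an error
 * clears it; every other cpu took the interrupt for nothing.
 * Returns the number of such "collateral damage" cpus and stores the
 * handling cpu (or -1 if nobody could see the error) in *handled_by.
 */
static int broadcast_cmci(struct bank *b, int *handled_by)
{
	int wasted = 0;

	assert(b != NULL && handled_by != NULL);
	*handled_by = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (b->has_error && cpu_can_see(cpu, b)) {
			b->has_error = false;
			*handled_by = cpu;
		} else {
			wasted++;
		}
	}
	return wasted;
}
```

With an error in a bank local to cpu 3, only cpu 3 can handle it and the other `NR_CPUS - 1` cpus are interrupted anyway. In this model a shared bank costs the same number of wasted interrupts; only the handling cpu changes.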
> What's wrong with doing that strictly per cpu and avoid the whole
> global state horror?
Is that less of a horror? We'd have some cpus polling and some
taking CMCI (in somewhat arbitrary and ever-changing combinations).
I'm not sure which is less bad.
-Tony