Message-ID: <1400486094.10554.3.camel@debian>
Date: Mon, 19 May 2014 15:54:54 +0800
From: Chen Yucong <slaoub@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: tony.luck@...el.com, linux-kernel@...r.kernel.org,
linux-edac@...r.kernel.org
Subject: Re: [PATCH] x86/mce: Distribute the clear operation of mces_seen to
Per-CPU rather than only monarch CPU
On Mon, 2014-05-19 at 09:26 +0200, Borislav Petkov wrote:
> On Mon, May 19, 2014 at 09:55:40AM +0800, Chen Yucong wrote:
> > But all other CPUs also have to wait monarch CPU to exit from mce_end.
> > What's the difference between monarch CPU and Per-CPU for clearing
> > mces_seen? In practice, there is no difference between them. If we use
> > monarch CPU to clear mces_seen, then Per-CPU variable can not play out
> > its advantage.
>
> I'll let you stare at mce_reign() a little bit longer... Also, pay
> attention to its callsite, that might help.
>
We can find the following code segment in mce_end:
-----
...
	if (order == 1) {
		/* CHECKME: Can this race with a parallel hotplug? */
		int cpus = num_online_cpus();

		/*
		 * Monarch: Wait for everyone to go through their scanning
		 * loops.
		 */
		while (atomic_read(&mce_executing) <= cpus) {
			if (mce_timed_out(&timeout))
				goto reset;
			ndelay(SPINUNIT);
		}

		mce_reign();
		barrier();
		ret = 0;
...
-----
If a timeout occurs on the monarch CPU, what happens in the code segment
above?
The monarch CPU jumps straight to the reset label (goto reset), so
mce_reign() is never invoked. The clearing of mces_seen is therefore
skipped, and the stale value of mces_seen reappears on the next MCE.
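
The scenario can be illustrated with a small user-space model. This is only
a sketch, not kernel code: NR_CPUS, the int array standing in for the per-CPU
mces_seen pointer, and every model_* helper are hypothetical names made up
for the illustration. The per-CPU variant shows one possible reading of
"each CPU clears its own entry"; it is an assumption about the intent, not
the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Stand-in for the per-CPU mces_seen: non-zero means "an MCE was recorded". */
static int mces_seen[NR_CPUS];

/* Record an event on one CPU, as the scanning loop would. */
static void model_record_mce(int cpu, int severity)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	mces_seen[cpu] = severity;
}

/* Model of mce_reign(): the monarch clears every CPU's entry. */
static void model_mce_reign(void)
{
	memset(mces_seen, 0, sizeof(mces_seen));
}

/* Monarch path of mce_end(): a timeout jumps past mce_reign(). */
static int model_mce_end_monarch(bool timed_out)
{
	if (timed_out)
		goto reset;

	model_mce_reign();
	return 0;
reset:
	/* mce_reign() was skipped, so nothing has been cleared. */
	return -1;
}

/* Assumed alternative: each CPU clears its own entry on every exit path. */
static int model_mce_end_percpu(int cpu, bool timed_out)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	mces_seen[cpu] = 0;
	return timed_out ? -1 : 0;
}

/* Count entries that would still be visible to the next MCE. */
static int model_stale_entries(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mces_seen[cpu])
			n++;
	return n;
}
```

Calling model_record_mce() followed by model_mce_end_monarch(true) leaves
the recorded entry in place, while model_mce_end_percpu() on the same CPU
clears it even when the timeout flag is set.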
thx!
cyc