Message-ID: <20161107184532.xj6wzdjlzwhshcmf@pd.tnic>
Date: Mon, 7 Nov 2016 19:45:32 +0100
From: Borislav Petkov <bp@...en8.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Tony Luck <tony.luck@...el.com>
Cc: linux-kernel@...r.kernel.org, rt@...utronix.de,
Tony Luck <tony.luck@...el.com>, linux-edac@...r.kernel.org,
x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 22/25] x86/mcheck: Do the init in one place
On Thu, Nov 03, 2016 at 03:50:18PM +0100, Sebastian Andrzej Siewior wrote:
> Part of the init (memory allocation and so on) is done
> in mcheck_cpu_init(). While moving the allocation to
> mcheck_init_device() (where the hotplug calls are initialized) it
> becomes necessary to move the callback (mcheck_cpu_init()), too.
>
> The callback is now removed from identify_cpu() and registered as a
> hotplug event which is invoked as the very first one which is shortly
> after the original point of invocation (look at smp_store_cpu_info() and
> notify_cpu_starting() in smp_callin()).
> One "visible" difference is that MCE for the boot CPU is not enabled at
> identify_boot_cpu() time but at device_initcall_sync() time. Either way,
> both times we had no userland around.
Uh, hm, I'm not sure about this: so the issue I see with this is that
the more we're delaying the enabling of MCE reporting - and especially
setting CR4[MCE] - the more we're increasing the window where an MCE
during early boot will cause a shutdown. (This is what happens if
CR4[MCE]=0b).
Perhaps we should split the init into a very early init which doesn't
need to be part of hotplug and the rest, which can do mce_disable_cpu()
and mce_reenable_cpu().
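
To make the proposed split concrete, here is a rough user-space model of it. This is only a sketch, not kernel code: every name ending in _model is hypothetical, the struct fields merely stand in for CR4[MCE] and the per-bank reporting state, and keeping CR4[MCE] set across offline/online is an assumption of the sketch.

```c
/* User-space model only, NOT kernel code. All *_model names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

struct cpu_model {
	bool cr4_mce;		/* stands in for CR4[MCE] */
	bool banks_enabled;	/* stands in for MCE reporting being enabled */
};

static struct cpu_model cpus[NR_CPUS_MODEL];

/*
 * Very early init: no memory allocation and not a hotplug callback, so it
 * could run at the original identify_cpu() point and keep the window with
 * CR4[MCE]=0 short.
 */
static void mce_early_init_model(int cpu)
{
	cpus[cpu].cr4_mce = true;
}

/* Hotplug "online" step, in the role of mce_reenable_cpu(). */
static int mce_online_model(int cpu)
{
	/* The early part must already have run on this CPU. */
	assert(cpus[cpu].cr4_mce);
	cpus[cpu].banks_enabled = true;
	return 0;
}

/*
 * Hotplug "offline" step, in the role of mce_disable_cpu().
 * Assumption of this sketch: CR4[MCE] stays set.
 */
static int mce_offline_model(int cpu)
{
	cpus[cpu].banks_enabled = false;
	return 0;
}
```

The point of the model is only the ordering: the early step has no dependency on the hotplug machinery, while the online/offline pair can be registered later without widening the window in which an MCE shuts the machine down.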
Tony, how do you see this?
> Cc: Tony Luck <tony.luck@...el.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: linux-edac@...r.kernel.org
> Cc: x86@...nel.org
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> ---
...
> @@ -2584,11 +2580,26 @@ static __init int mcheck_init_device(void)
> goto err_out;
> }
>
> + err = __mcheck_cpu_mce_banks_init();
^^^^^^^^
I guess you can merge this one...
> + if (err)
> + goto err_out_mem;
> +
> mce_init_banks();
^^^^^^^^
into this one now.
But let's sort out the bigger issue first.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.