Message-ID: <20140620152312.GB11391@pd.tnic>
Date: Fri, 20 Jun 2014 17:23:12 +0200
From: Borislav Petkov <bp@...en8.de>
To: Boris Ostrovsky <boris.ostrovsky@...cle.com>
Cc: tony.luck@...el.com, linux-kernel@...r.kernel.org,
linux-edac@...r.kernel.org, mattieu.souchaud@...e.fr
Subject: Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

On Fri, Jun 20, 2014 at 10:28:13AM -0400, Boris Ostrovsky wrote:
> Commit 9c15a24b038f4d8da93a2bc2554731f8953a7c17 (x86/mce: Improve
> mcheck_init_device() error handling) unregisters (or never registers)
> MCE's hotplug notifier if an error is encountered.
Well, mcheck_init_device() did encounter errors before that commit too.
Can you please go into detail on how exactly you're triggering this?
Which error exactly are you talking about?
Lemme guess: some xen special handling which baremetal doesn't need.
> Since unplugging a CPU would normally result in the notifier deleting
> MCE timer we are now left with the timer running if a CPU is removed on
> a system where mcheck_init_device() had failed.
>
> If we later hotplug this CPU back we add this timer again in
> mcheck_cpu_init(). Eventually the two timers start interfering with each
> other, causing soft lockups or system hangs.
>
> We should leave the notifier always on and, in fact, set it up early
> during the boot.
We do leave it always on - we only unregister it if we've encountered an
error.
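
For reference, the double-timer sequence described in the quoted text can
be modeled in user space roughly like this. It is only a toy sketch under
stated assumptions, not kernel code: struct toy_cpu and the toy_*()
helpers are hypothetical names, and a plain counter stands in for the
per-CPU MCE timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */
struct toy_cpu {
	int armed_timers;	/* MCE poll timers currently armed for this CPU */
};

/*
 * Models the timer being added when a CPU comes up (the quoted text
 * attributes this to mcheck_cpu_init()).
 */
void toy_cpu_online(struct toy_cpu *cpu)
{
	cpu->armed_timers++;
}

/* Models CPU removal: only a registered notifier deletes the timer. */
void toy_cpu_offline(struct toy_cpu *cpu, bool notifier_registered)
{
	if (notifier_registered && cpu->armed_timers > 0)
		cpu->armed_timers--;
}

/* Boot, unplug, replug; return the number of timers armed afterwards. */
int toy_replug_cycle(bool notifier_registered)
{
	struct toy_cpu cpu = { .armed_timers = 0 };

	toy_cpu_online(&cpu);				/* initial bring-up */
	toy_cpu_offline(&cpu, notifier_registered);	/* CPU removed */
	toy_cpu_online(&cpu);				/* CPU hotplugged back */

	assert(cpu.armed_timers >= 1);
	return cpu.armed_timers;
}
```

In this model, toy_replug_cycle(true) leaves one timer armed, while
toy_replug_cycle(false) leaves two, which is the situation the patch
description is talking about.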
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.