Message-ID: <20150904115029.GA23550@nazgul.tnic>
Date: Fri, 4 Sep 2015 13:50:29 +0200
From: Borislav Petkov <bp@...en8.de>
To: Ashok Raj <ashok.raj@...el.com>
Cc: linux-kernel@...r.kernel.org, Boris Petkov <bp@...e.de>,
linux-edac@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
Serge Ayoun <serge.ayoun@...el.com>
Subject: Re: [Patch V0] x86, mce: Don't clear global error reporting banks
during cpu_offline
On Thu, Sep 03, 2015 at 02:17:04PM -0400, Ashok Raj wrote:
> During CPU offline, or during suspend/resume operations, its not safe to
> clear MCi_CTL. These MSR's are either thread scoped (meaning private to
> thread), or core scoped (private to threads in that core only), or socket
> scope i.e visible and controllable from all threads in the socket.
>
> When we turn off during CPU_OFFLINE, just offlining a single CPU will
> stop signaling for all the socket wide resources, such as LLC, iMC for e.g.
>
> It is true for Intel CPU's. But there seems some history that other processors
> may require to turn these off during every CPU offline.
>
> Intel Secure Guard eXtentions will be disabled when these controls are cleared
> from a security perspective. This patch enables SGX to work across
> suspend/resume.
What does that mean? What does SGX have to do with the MCi_CTL registers?
Explain that in the commit message so that !Intel people can understand it, too.
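To make the scoping claim in the quoted text concrete, here is a toy
userspace model. It is only a sketch: the toy_* structs and functions are
made up for illustration and are not kernel types, and a plain shared
variable stands in for a socket-scoped MCi_CTL. It shows how zeroing the
shared word on behalf of one CPU going offline also silences shared-resource
reporting for a sibling that stays online.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these structs are hypothetical, not kernel types. */
struct toy_socket {
	uint64_t shared_ctl;	/* stands in for a socket-scoped MCi_CTL */
};

struct toy_cpu {
	struct toy_socket *socket;
	uint64_t private_ctl;	/* stands in for a thread-scoped MCi_CTL */
	int online;
};

/* Offline path that zeroes every control word this CPU can write. */
static void toy_offline_clear_all(struct toy_cpu *c)
{
	assert(c != NULL && c->socket != NULL);
	c->private_ctl = 0;
	c->socket->shared_ctl = 0;	/* visible to every sibling, too */
	c->online = 0;
}

/* Offline path that leaves the control words alone. */
static void toy_offline_keep_ctl(struct toy_cpu *c)
{
	assert(c != NULL);
	c->online = 0;
}

/* An online CPU sees shared-resource errors only while the shared word is set. */
static int toy_shared_reporting_enabled(const struct toy_cpu *c)
{
	assert(c != NULL && c->socket != NULL);
	return c->online && c->socket->shared_ctl != 0;
}
```

With two CPUs on one toy socket, calling toy_offline_clear_all() on the
first makes toy_shared_reporting_enabled() return 0 for the second, while
toy_offline_keep_ctl() leaves it at 1.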
> - Consolidated some code to use sharing
> - Minor changes to some prototypes to fit usage.
> - Left handling same for non-Intel CPU models to avoid any unknown regressions.
For the whole patch text do:
s/cpu/CPU/
s/CPU's/CPUs/ and s/MSR's/MSRs/ if you mean plural. Also spellcheck all text.
>
> Signed-off-by: Ashok Raj <ashok.raj@...el.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
> Tested-by: Serge Ayoun <serge.ayoun@...el.com>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 38 ++++++++++++++++++++++++++++----------
> 1 file changed, 28 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index d350858..5498a79 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -2100,7 +2100,7 @@ int __init mcheck_init(void)
> * Disable machine checks on suspend and shutdown. We can't really handle
> * them later.
> */
> -static int mce_disable_error_reporting(void)
> +static void mce_disable_error_reporting(void)
> {
> 	int i;
>
> @@ -2110,17 +2110,40 @@ static int mce_disable_error_reporting(void)
> 		if (b->init)
> 			wrmsrl(MSR_IA32_MCx_CTL(i), 0);
> 	}
> -	return 0;
> +	return;
> +}
> +
> +static void _vendor_disable_error_reporting(void)
Why the "_" prepended here?
> +{
> +	struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> +	switch (c->x86_vendor) {
> +	case X86_VENDOR_INTEL:
> +		/*
> +		 * Don't clear on Intel CPU's. Some of these MSR's are
> +		 * socket wide. Disabling them for just a single cpu offline
> +		 * is bad, since it will inhibit reporting for all shared
> +		 * resources.. such as LLC, iMC for e.g.
> +		 */
> +		break;
> +	default:
> +		/*
> +		 * Disble MCE reporting for all other CPU Vendor.
> +		 * Don't want to break functionality on those
> +		 */
> +		mce_disable_error_reporting();
> +	}
I think the switch-case makes this unnecessarily bloated. Just do:

	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
		return;

	mce_disable_error_reporting();
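
For anyone who wants to convince themselves that the two forms behave the
same, a rough userspace sketch follows. Everything prefixed toy_/TOY_ is a
stand-in invented for this example: the enum replaces the kernel's
X86_VENDOR_* constants and the stub replaces mce_disable_error_reporting().
It is not meant as the final kernel code.

```c
#include <assert.h>

/* Stand-ins for the kernel's X86_VENDOR_* constants; values are arbitrary. */
enum toy_vendor { TOY_VENDOR_INTEL, TOY_VENDOR_OTHER };

static int toy_disable_calls;	/* how often the stub below ran */

/* Stub standing in for mce_disable_error_reporting(). */
static void toy_disable_error_reporting(void)
{
	toy_disable_calls++;
}

/* Switch form, shaped like the quoted hunk. */
static void toy_vendor_disable_switch(enum toy_vendor vendor)
{
	switch (vendor) {
	case TOY_VENDOR_INTEL:
		break;
	default:
		toy_disable_error_reporting();
	}
}

/* Early-return form, shaped like the suggestion. */
static void toy_vendor_disable_if(enum toy_vendor vendor)
{
	if (vendor == TOY_VENDOR_INTEL)
		return;

	toy_disable_error_reporting();
}

/* Returns 1 if both forms call the stub equally often for this vendor. */
static int toy_forms_agree(enum toy_vendor vendor)
{
	int before, via_switch, via_if;

	before = toy_disable_calls;
	toy_vendor_disable_switch(vendor);
	via_switch = toy_disable_calls - before;

	before = toy_disable_calls;
	toy_vendor_disable_if(vendor);
	via_if = toy_disable_calls - before;

	assert(via_switch == 0 || via_switch == 1);
	return via_switch == via_if;
}
```

Calling toy_forms_agree() for each vendor value should return 1 in both
cases; the early-return form simply says the same thing in three lines.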
...
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.