Message-ID: <20150904115029.GA23550@nazgul.tnic>
Date:	Fri, 4 Sep 2015 13:50:29 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Ashok Raj <ashok.raj@...el.com>
Cc:	linux-kernel@...r.kernel.org, Boris Petkov <bp@...e.de>,
	linux-edac@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
	Serge Ayoun <serge.ayoun@...el.com>
Subject: Re: [Patch V0] x86, mce: Don't clear global error reporting banks
 during cpu_offline
On Thu, Sep 03, 2015 at 02:17:04PM -0400, Ashok Raj wrote:
> During CPU offline, or during suspend/resume operations, its not safe to
> clear MCi_CTL. These MSR's are either thread scoped (meaning private to
> thread), or core scoped (private to threads in that core only), or socket
> scope i.e visible and controllable from all threads in the socket.
> 
> When we turn off during CPU_OFFLINE, just offlining a single CPU will
> stop signaling for all the socket wide resources, such as LLC, iMC for e.g.
> 
> It is true for Intel CPU's. But there seems some history that other processors
> may require to turn these off during every CPU offline.
> 
> Intel Secure Guard eXtentions will be disabled when these controls are cleared
> from a security perspective. This patch enables SGX to work across
> suspend/resume.
What does that mean? What does SGX have to do with the MCi_CTL registers?
Explain that in the commit message so that !Intel people can understand it too.
> - Consolidated some code to use sharing
> - Minor changes to some prototypes to fit usage.
> - Left handling same for non-Intel CPU models to avoid any unknown regressions.
For the whole patch text do:
s/cpu/CPU/
s/CPU's/CPUs/ and s/MSR's/MSRs/ if you mean plural. Also spellcheck all text.
> 
> Signed-off-by: Ashok Raj <ashok.raj@...el.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
> Tested-by: Serge Ayoun <serge.ayoun@...el.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce.c | 38 ++++++++++++++++++++++++++++----------
>  1 file changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index d350858..5498a79 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -2100,7 +2100,7 @@ int __init mcheck_init(void)
>   * Disable machine checks on suspend and shutdown. We can't really handle
>   * them later.
>   */
> -static int mce_disable_error_reporting(void)
> +static void mce_disable_error_reporting(void)
>  {
>  	int i;
>  
> @@ -2110,17 +2110,40 @@ static int mce_disable_error_reporting(void)
>  		if (b->init)
>  			wrmsrl(MSR_IA32_MCx_CTL(i), 0);
>  	}
> -	return 0;
> +	return;
> +}
> +
> +static void _vendor_disable_error_reporting(void)
Why is the "_" prepended here?
> +{
> +	struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> +	switch (c->x86_vendor) {
> +	case X86_VENDOR_INTEL:
> +		/*
> +		 * Don't clear on Intel CPU's. Some of these MSR's are
> +		 * socket wide. Disabling them for just a single cpu offline
> +		 * is bad, since it will inhibit reporting for all shared
> +		 * resources.. such as LLC, iMC for e.g.
> +		 */
> +		break;
> +	default:
> +		/*
> +		 * Disble MCE reporting for all other CPU Vendor.
> +		 * Don't want to break functionality on those
> +		 */
> +		mce_disable_error_reporting();
> +	}
I think the switch-case bloats this code unnecessarily. Just
do:
	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
		return;
	mce_disable_error_reporting();
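
To make the intent concrete, here is a minimal userspace sketch of the early-return form, not kernel code. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in invented for illustration: the array models MCi_CTL words that are shared by all CPUs in a socket, and the vendor enum stands in for boot_cpu_data.x86_vendor.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's X86_VENDOR_* constants. */
enum mock_vendor { MOCK_VENDOR_INTEL, MOCK_VENDOR_OTHER };

#define MOCK_NUM_BANKS 4

/*
 * Assumption for this sketch: one control word per bank, shared by every
 * CPU in the socket. Clearing it from one CPU clears it for all of them.
 */
static uint64_t mock_mci_ctl[MOCK_NUM_BANKS];

/* Enable all error reporting bits in every mock bank. */
static void mock_enable_error_reporting(void)
{
	for (int i = 0; i < MOCK_NUM_BANKS; i++)
		mock_mci_ctl[i] = ~0ULL;
}

/* Models the wrmsrl(MSR_IA32_MCx_CTL(i), 0) loop from the patch. */
static void mock_disable_error_reporting(void)
{
	for (int i = 0; i < MOCK_NUM_BANKS; i++)
		mock_mci_ctl[i] = 0;
}

/* Early return instead of a switch-case: leave the banks alone on Intel. */
static void mock_vendor_disable_error_reporting(enum mock_vendor vendor)
{
	if (vendor == MOCK_VENDOR_INTEL)
		return;

	mock_disable_error_reporting();
}

/* Returns 1 if every mock bank still has reporting enabled, 0 otherwise. */
static int mock_reporting_enabled(void)
{
	for (int i = 0; i < MOCK_NUM_BANKS; i++)
		if (!mock_mci_ctl[i])
			return 0;
	return 1;
}
```

With the banks enabled, calling mock_vendor_disable_error_reporting(MOCK_VENDOR_INTEL) leaves mock_reporting_enabled() true, while any other vendor value clears the shared words for the whole mock socket.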
...
-- 
Regards/Gruss,
    Boris.
ECO tip #101: Trim your mails when you reply.
