[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170405133850.ihfyvle3sc5p4yer@pd.tnic>
Date: Wed, 5 Apr 2017 15:39:42 +0200
From: Borislav Petkov <bp@...en8.de>
To: Yazen Ghannam <Yazen.Ghannam@....com>
Cc: linux-edac@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA MCA_DE{STAT,ADDR}
registers
On Tue, Apr 04, 2017 at 12:24:31PM -0500, Yazen Ghannam wrote:
> From: Yazen Ghannam <yazen.ghannam@....com>
>
> We have support for the new SMCA MCA_DE{STAT,ADDR} registers in Linux. So
> we've used these registers in place of MCA_{STATUS,ADDR} on SMCA systems.
> However, the guidance for current implementations of SMCA is to continue
> using MCA_{STATUS,ADDR} and to use MCA_DE{STAT,ADDR} only if a Deferred
> error was not found in the former registers. This also means we shouldn't
> clear MCA_CONFIG[LogDeferredInMcaStat].
Does that mean we should trust what BIOS has set it to?
>
> Redo the AMD Deferred error interrupt handler to follow the guidance for
> current SMCA systems. Also, don't break after finding the first error.
>
> Rework __log_error() for clarity.
>
> Don't clear MCA_CONFIG[LogDeferredInMcaStat] during AMD mcheck init.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> ---
...
> @@ -832,25 +821,29 @@ asmlinkage __visible void __irq_entry smp_trace_deferred_error_interrupt(void)
> exiting_ack_irq();
> }
>
> +static inline bool is_deferred_error(u64 status)
> +{
> + return ((status & MCI_STATUS_VAL) && (status & MCI_STATUS_DEFERRED));
> +}
> +
> /* APIC interrupt handler for deferred errors */
> static void amd_deferred_error_interrupt(void)
> {
> unsigned int bank;
> - u32 msr_status;
> u64 status;
>
> for (bank = 0; bank < mca_cfg.banks; ++bank) {
> - msr_status = (mce_flags.smca) ? MSR_AMD64_SMCA_MCx_DESTAT(bank)
> - : msr_ops.status(bank);
> + rdmsrl(msr_ops.status(bank), status);
>
> - rdmsrl(msr_status, status);
> + if (is_deferred_error(status)) {
> + __log_error_deferred(bank, false);
>
> - if (!(status & MCI_STATUS_VAL) ||
> - !(status & MCI_STATUS_DEFERRED))
> - continue;
> + } else if (mce_flags.smca) {
> + rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), status);
>
> - __log_error(bank, true, false, 0);
> - break;
> + if (is_deferred_error(status))
> + __log_error_deferred(bank, true);
> + }
The diff is very hard to parse, lemme paste the whole resulting function
after the patch:
> /* APIC interrupt handler for deferred errors */
> static void amd_deferred_error_interrupt(void)
> {
> unsigned int bank;
> u64 status;
>
> for (bank = 0; bank < mca_cfg.banks; ++bank) {
> rdmsrl(msr_ops.status(bank), status);
>
> if (is_deferred_error(status)) {
> __log_error_deferred(bank, false);
>
> } else if (mce_flags.smca) {
> rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), status);
>
> if (is_deferred_error(status))
> __log_error_deferred(bank, true);
So this doesn't sound like the guidance: if we should fallback to the
DE* versions only if a deferred error wasn't found in the usual regs,
the logic should be:
static void __log_error_deferred(unsigned int bank)
{
u32 msr_stat = msr_ops.status(bank);
u32 msr_addr = msr_ops.addr(bank);
/*
* __log_error() needs to return true to say it has logged successfully or false
* if it hasn't.
*/
if (__log_error(bank, msr_stat, msr_addr, 0))
goto out;
if (!mce_flags.smca)
return;
/*
* Couldn't log a deferred error from the usual regs, fallback to DE*
* variants.
*/
msr_stat = MSR_AMD64_SMCA_MCx_DESTAT(bank);
msr_addr = MSR_AMD64_SMCA_MCx_DEADDR(bank);
__log_error(bank, msr_stat, msr_addr, 0);
out:
/* We should still clear MCA_DESTAT even if we used MCA_STATUS. */
if (mce_flags.smca)
wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
}
and this way you don't need to wiggle around that use_smca_destat thing.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Powered by blists - more mailing lists