Message-ID: <20190516172117.GC21857@zn.tnic>
Date: Thu, 16 May 2019 19:21:17 +0200
From: Borislav Petkov <bp@...en8.de>
To: "Ghannam, Yazen" <Yazen.Ghannam@....com>
Cc: "Luck, Tony" <tony.luck@...el.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in
hardware
On Thu, May 16, 2019 at 05:09:11PM +0000, Ghannam, Yazen wrote:
> So that the sysfs files show the control values that are set in the
> hardware. It seemed like this would be more helpful than showing all
> 0xF's.
Yeah, but it has been like that since forever and it hasn't bugged
anybody. Probably because nobody even looks at those files. As
Tony says:
"RAS is a lonely subsystem ... even EDAC gets more love."
:-)))
And adding yet another vendor check for this seemed just not worth it.
> Should I send out another version of this set?
I simply zapped 5/6. I still think your 6/6 makes sense though.
---
From: Yazen Ghannam <yazen.ghannam@....com>
Date: Tue, 30 Apr 2019 20:32:21 +0000
Subject: [PATCH] x86/MCE: Determine MCA banks' init state properly
The OS is expected to write all bits to MCA_CTL for each bank,
thus enabling error reporting in all banks. However, some banks
may be unused in which case the registers for such banks are
Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control
bits because of quirks, etc.
A bank can be considered uninitialized if the MCA_CTL register returns
zero. This is because either the OS did not write anything or because
the hardware is enforcing RAZ/WI for the bank.
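As an illustration only, here is a minimal user-space sketch of the rule described above. It is not kernel code: struct sim_bank, sim_wrmsr(), sim_rdmsr() and sim_bank_init() are hypothetical names that model a bank's MCA_CTL register, with a flag standing in for hardware that enforces RAZ/WI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one bank's MCA_CTL register (not a kernel type). */
struct sim_bank {
	uint64_t ctl;	/* value held by the simulated register */
	bool raz_wi;	/* true: unused bank, Read-as-Zero/Writes-Ignored */
};

/* Simulated MSR write: silently dropped for a RAZ/WI bank. */
static void sim_wrmsr(struct sim_bank *b, uint64_t val)
{
	if (!b->raz_wi)
		b->ctl = val;
}

/* Simulated MSR read: always zero for a RAZ/WI bank. */
static uint64_t sim_rdmsr(const struct sim_bank *b)
{
	return b->raz_wi ? 0 : b->ctl;
}

/*
 * Write the wanted control bits, then read the register back.
 * The bank counts as initialized only if the read-back is non-zero.
 */
static bool sim_bank_init(struct sim_bank *b, uint64_t want)
{
	sim_wrmsr(b, want);
	return sim_rdmsr(b) != 0;
}
```

In this model a bank ends up uninitialized in both cases named above: when nothing (or zero) is written, and when the write is ignored because the bank is RAZ/WI.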
Set a bank's init value based on whether the control bits are set in
hardware. Return an error code in the sysfs interface for uninitialized
banks.
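The sysfs behaviour can be sketched the same way in plain user-space C. This is an assumption-laden model, not the kernel's sysfs API: struct model_bank, model_show_bank() and model_set_bank() are hypothetical stand-ins for the per-bank show/store handlers, and snprintf() is used in place of the kernel's sprintf() into a sysfs buffer.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-bank state: cached control value plus init flag. */
struct model_bank {
	uint64_t ctl;
	bool init;
};

/*
 * Model of the "show" handler: print the control value as hex.
 * Returns the number of characters written, or -ENODEV if the bank
 * is uninitialized.
 */
static int model_show_bank(const struct model_bank *b, char *buf, size_t len)
{
	if (!b->init)
		return -ENODEV;

	return snprintf(buf, len, "%llx\n", (unsigned long long)b->ctl);
}

/*
 * Model of the "store" handler: accept a new control value.
 * Returns 0 on success, or -ENODEV if the bank is uninitialized.
 */
static int model_set_bank(struct model_bank *b, uint64_t new_ctl)
{
	if (!b->init)
		return -ENODEV;

	b->ctl = new_ctl;
	return 0;
}
```

A reader or writer of an uninitialized bank's file then sees an error instead of a value that does not reflect the hardware.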
[ bp: Massage a bit. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
Signed-off-by: Borislav Petkov <bp@...e.de>
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Tony Luck <tony.luck@...el.com>
Cc: "x86@...nel.org" <x86@...nel.org>
Link: https://lkml.kernel.org/r/20190430203206.104163-7-Yazen.Ghannam@amd.com
---
arch/x86/kernel/cpu/mce/core.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5bcecadcf4d9..c049689f3d73 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1567,10 +1567,13 @@ static void __mcheck_cpu_init_clear_banks(void)
 	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
 		struct mce_bank *b = &mce_banks[i];
-		if (!b->init)
-			continue;
-		wrmsrl(msr_ops.ctl(i), b->ctl);
-		wrmsrl(msr_ops.status(i), 0);
+		if (b->init) {
+			wrmsrl(msr_ops.ctl(i), b->ctl);
+			wrmsrl(msr_ops.status(i), 0);
+		}
+
+		/* Bank is initialized if bits are set in hardware. */
+		b->init = !!b->ctl;
 	}
 }
@@ -2095,6 +2098,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr,
 	b = &per_cpu(mce_banks_array, s->id)[bank];
+	if (!b->init)
+		return -ENODEV;
+
 	return sprintf(buf, "%llx\n", b->ctl);
 }
@@ -2113,6 +2119,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
 	b = &per_cpu(mce_banks_array, s->id)[bank];
+	if (!b->init)
+		return -ENODEV;
+
 	b->ctl = new;
 	mce_restart();
--
2.21.0
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.