[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170118203413.go4eh27gaxzcbf4k@pd.tnic>
Date: Wed, 18 Jan 2017 21:34:13 +0100
From: Borislav Petkov <bp@...en8.de>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: "Luck, Tony" <tony.luck@...el.com>,
linux-edac <linux-edac@...r.kernel.org>, X86 ML <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/MCE: Remove MCP_TIMESTAMP
On Wed, Nov 09, 2016 at 07:06:04PM +0100, Borislav Petkov wrote:
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 4ca00474804b..3fbe80066b4e 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -706,6 +706,17 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
>
> mce_gather_info(&m, NULL);
>
> + /*
> + * m.tsc was set in mce_setup(). Clear it if not requested.
> + *
> + * FIXME: Propagate @flags to mce_gather_info/mce_setup() to avoid
> + * that dance.
> + */
Lemme try to address that FIXME. So it doesn't look particularly easy to
do without touching/changing a bunch of places. That's why I'm trying
tricks first.
For example, the mce-apei.c case I'm addressing by setting ->tsc only
for errors of panic severity. The idea there is, that, panic errors will
have raised an #MC and not polled.
And then instead of propagating a flag to mce_setup(), it seems
easier/less code to set ->tsc depending on the call sites, i.e.,
are we polling or are we preparing an MCE record in an exception
handler/thresholding interrupt.
I'll run it tomorrow.
---
diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index 83f1a98d37db..2eee85379689 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -52,8 +52,11 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
if (severity >= GHES_SEV_RECOVERABLE)
m.status |= MCI_STATUS_UC;
- if (severity >= GHES_SEV_PANIC)
+
+ if (severity >= GHES_SEV_PANIC) {
m.status |= MCI_STATUS_PCC;
+ m.tsc = rdtsc();
+ }
m.addr = mem_err->physical_addr;
mce_log(&m);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6eef6fde0f02..ca15a7e1f97d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -128,7 +128,6 @@ void mce_setup(struct mce *m)
{
memset(m, 0, sizeof(struct mce));
m->cpu = m->extcpu = smp_processor_id();
- m->tsc = rdtsc();
/* We hope get_seconds stays lockless */
m->time = get_seconds();
m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -710,14 +709,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
mce_gather_info(&m, NULL);
- /*
- * m.tsc was set in mce_setup(). Clear it if not requested.
- *
- * FIXME: Propagate @flags to mce_gather_info/mce_setup() to avoid
- * that dance.
- */
- if (!(flags & MCP_TIMESTAMP))
- m.tsc = 0;
+ if (flags & MCP_TIMESTAMP)
+ m.tsc = rdtsc();
for (i = 0; i < mca_cfg.banks; i++) {
if (!mce_banks[i].ctl || !test_bit(i, *b))
@@ -1156,6 +1149,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
goto out;
mce_gather_info(&m, regs);
+ m.tsc = rdtsc();
final = this_cpu_ptr(&mces_seen);
*final = m;
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 776379e4a39c..9e5427df3243 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -778,7 +778,8 @@ __log_error(unsigned int bank, bool deferred_err, bool threshold_err, u64 misc)
mce_setup(&m);
m.status = status;
- m.bank = bank;
+ m.bank = bank;
+ m.tsc = rdtsc();
if (threshold_err)
m.misc = misc;
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Powered by blists - more mailing lists