Message-Id: <1480511979-11722-1-git-send-email-prarit@redhat.com>
Date: Wed, 30 Nov 2016 08:19:39 -0500
From: Prarit Bhargava <prarit@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Prarit Bhargava <prarit@...hat.com>, Borislav Petkov <bp@...e.de>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Len Brown <lenb@...nel.org>,
Paul Gortmaker <paul.gortmaker@...driver.com>,
Tyler Baicar <tbaicar@...eaurora.org>,
Punit Agrawal <punit.agrawal@....com>,
Don Zickus <dzickus@...hat.com>
Subject: [PATCH v2] ACPI / APEI: Fix NMI notification handling

When removing and re-adding CPU 0 on a system with GHES NMI, the following
stack trace is seen when the CPU is re-added:
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
Call Trace:
dump_stack+0x63/0x8e
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
setup_local_APIC+0x275/0x370
apic_ap_setup+0xe/0x20
start_secondary+0x48/0x180
set_init_arg+0x55/0x55
early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x13d/0x14c

During CPU bringup, wakeup_cpu_via_init_nmi() is called and issues an
NMI on CPU 0. The GHES NMI handler, ghes_notify_nmi(), queues the
ghes_proc_irq_work irq_work, which ends up setting IRQ_WORK_VECTOR
(0xf6). The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
0xf6 (specifically, APIC IRR for irqs 255 to 224 is 0x400000), which confirms
that something has set the IRQ_WORK_VECTOR line prior to the APIC being
initialized.

Commit 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
incorrectly modified the behavior such that the handler returns
NMI_HANDLED only if an error was processed, and incorrectly queues the
GHES irq_work for every NMI.

This patch modifies ghes_notify_nmi() to behave as it did prior to
2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") by
properly returning NMI_HANDLED and only queueing ghes_proc_irq_work if
NMI_HANDLED has been set.

v2: Per Borislav, moved the setting of NMI_HANDLED and cleaned up the changelog.

Fixes: 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
Signed-off-by: Prarit Bhargava <prarit@...hat.com>
Cc: Borislav Petkov <bp@...e.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
Cc: Len Brown <lenb@...nel.org>
Cc: Paul Gortmaker <paul.gortmaker@...driver.com>
Cc: Tyler Baicar <tbaicar@...eaurora.org>
Cc: Punit Agrawal <punit.agrawal@....com>
Cc: Don Zickus <dzickus@...hat.com>
---
drivers/acpi/apei/ghes.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0d099a24f776..e53bef6cf53c 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -852,6 +852,8 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		if (ghes_read_estatus(ghes, 1)) {
 			ghes_clear_estatus(ghes);
 			continue;
+		} else {
+			ret = NMI_HANDLED;
 		}
 
 		sev = ghes_severity(ghes->estatus->error_severity);
@@ -863,12 +865,11 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 
 		__process_error(ghes);
 		ghes_clear_estatus(ghes);
-
-		ret = NMI_HANDLED;
 	}
 
 #ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	irq_work_queue(&ghes_proc_irq_work);
+	if (ret == NMI_HANDLED)
+		irq_work_queue(&ghes_proc_irq_work);
 #endif
 	atomic_dec(&ghes_in_nmi);
 	return ret;
--
1.7.9.3