Message-ID: <20250507012145.2998143-1-sohil.mehta@intel.com>
Date: Tue, 6 May 2025 18:21:36 -0700
From: Sohil Mehta <sohil.mehta@...el.com>
To: x86@...nel.org,
linux-kernel@...r.kernel.org
Cc: Xin Li <xin@...or.com>,
"H . Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Sean Christopherson <seanjc@...gle.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Zhang Rui <rui.zhang@...el.com>,
Lukasz Luba <lukasz.luba@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Sohil Mehta <sohil.mehta@...el.com>,
Brian Gerst <brgerst@...il.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Jacob Pan <jacob.pan@...ux.microsoft.com>,
Andi Kleen <ak@...ux.intel.com>,
Kai Huang <kai.huang@...el.com>,
Nikolay Borisov <nik.borisov@...e.com>,
linux-perf-users@...r.kernel.org,
linux-edac@...r.kernel.org,
kvm@...r.kernel.org,
linux-pm@...r.kernel.org,
linux-trace-kernel@...r.kernel.org
Subject: [PATCH v5 0/9] x86: Add support for NMI-source reporting with FRED
Introduction
============
NMI-source reporting with FRED [1] provides a new mechanism for
identifying the source of NMIs. As part of the FRED event delivery
framework, a 16-bit vector bitmap is provided that identifies one or
more sources that caused the NMI.
Using the source bitmap, the kernel can precisely run the relevant NMI
handlers instead of polling the entire NMI handler list. Additionally,
the source information would be invaluable for debugging misbehaving
handlers and unknown NMIs.
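As a rough illustration of the idea (not the kernel implementation), the
following userspace C sketch models how a source bitmap can narrow down
which handlers run. All names here (model_handler, model_dispatch,
MODEL_BIT) are hypothetical. The sketch assumes that a handler registered
without a vector is always polled, and that an empty bitmap means no
source information is available, in which case every handler is polled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the patches. */
#define MODEL_BIT(n) (1u << (n))

struct model_handler {
	const char *name;
	uint8_t source_vector;	/* 0 means "no vector assigned" in this model */
	int calls;		/* how many times the handler was run */
};

/*
 * Run every handler whose vector is set in the bitmap. Handlers without a
 * vector are always polled (an assumption of this model), and an empty
 * bitmap falls back to polling the whole list.
 * Returns the number of handlers that were run.
 */
int model_dispatch(struct model_handler *h, size_t n, uint16_t bitmap)
{
	int ran = 0;

	for (size_t i = 0; i < n; i++) {
		int match = h[i].source_vector &&
			    (bitmap & MODEL_BIT(h[i].source_vector));

		if (!bitmap || !h[i].source_vector || match) {
			h[i].calls++;
			ran++;
		}
	}
	return ran;
}

/* Usage example: three handlers, one of them without a vector. */
void model_dispatch_example(void)
{
	struct model_handler hs[] = {
		{ "perf", 1, 0 }, { "ipi", 2, 0 }, { "legacy", 0, 0 },
	};

	/* No source information: all three handlers are polled. */
	assert(model_dispatch(hs, 3, 0) == 3);
	/* Only vector 2 reported: "ipi" runs, plus the vector-less "legacy". */
	assert(model_dispatch(hs, 3, MODEL_BIT(2)) == 2);
}
```

The list order is left untouched in this model; the bitmap only decides
which entries are skipped.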
Changes since the last version
==============================
v4: https://lore.kernel.org/lkml/20240709143906.1040477-1-jacob.jun.pan@linux.intel.com/
Apart from the change of personnel, the patches include the following major
changes:
* Reorder the patches to have the infrastructure changes precede the
feature addition. (Sean)
* Use a simplified encoding mechanism for NMI-source vectors. (Sean)
* Get rid of the alternate NMI vector priority scheme. (below)
* Simplify NMI handling logic with source bitmap. (below)
Existing NMI handling code already has a priority mechanism for the NMI
handlers, with CPU-specific (NMI_LOCAL) handlers executed first followed
by platform NMI handlers and unknown NMI (NMI_UNKNOWN) handlers being
last. Within each of these NMI types, the handlers registered with
NMI_FLAG_FIRST are given priority.
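The ordering within one NMI type can be pictured with a small userspace
model. This is only a sketch under stated assumptions: model_list,
model_register and MODEL_FLAG_FIRST are hypothetical stand-ins, and the
model assumes that a handler flagged as "first" goes to the head of its
list while all others are appended at the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_FLAG_FIRST 0x1u	/* hypothetical stand-in for NMI_FLAG_FIRST */
#define MODEL_MAX 8

/* One list per NMI type in this model; handlers run in array order. */
struct model_list {
	const char *names[MODEL_MAX];
	size_t count;
};

/* Returns 0 on success, -1 if the list is full. */
int model_register(struct model_list *l, const char *name, unsigned int flags)
{
	if (l->count >= MODEL_MAX)
		return -1;

	if (flags & MODEL_FLAG_FIRST) {
		/* Shift existing entries down and place the new one at the head. */
		memmove(&l->names[1], &l->names[0],
			l->count * sizeof(l->names[0]));
		l->names[0] = name;
	} else {
		/* Regular handlers are appended at the tail. */
		l->names[l->count] = name;
	}
	l->count++;
	return 0;
}

/* Usage example: "b" is registered second but ends up first. */
void model_register_example(void)
{
	struct model_list l = { 0 };

	model_register(&l, "a", 0);
	model_register(&l, "b", MODEL_FLAG_FIRST);
	model_register(&l, "c", 0);
	assert(strcmp(l.names[0], "b") == 0);
}
```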
It is essential that new NMI-source handling follows the same scheme to
maintain consistent behavior with and without NMI-source. If there is a
need for a more granular priority scheme, it should be introduced at the
generic NMI handler level instead of assigning priorities to NMI-source
vectors.
This design choice also simplifies the NMI handling logic. It is now
possible to drop the complexity introduced by a new handler lookup table
and the partial bitmap handling logic.
The updated code (patch 5) is significantly less intrusive and easier to
maintain.
Day in the life of an NMI-source vector
=======================================
A brief overview of how NMI-source vectors are used:
// Allocate a static source vector at compile time
#define NMIS_VECTOR_TEST 1
// Register an NMI handler with the vector
register_nmi_handler(NMI_LOCAL, test_handler, 0, "nmi_test", NMIS_VECTOR_TEST);
// Generate an NMI with the source vector using NMI encoded delivery
__apic_send_IPI_mask(cpumask, APIC_DM_NMI | NMIS_VECTOR_TEST);
// Handle an NMI with or without the source information (oversimplified)
source_bitmap = fred_event_data(regs);
if (!source_bitmap || (source_bitmap & BIT(NMIS_VECTOR_TEST)))
	test_handler();
// Unregister handler along with the vector
unregister_nmi_handler(NMI_LOCAL, "nmi_test");
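To make the "NMI encoded delivery" step more concrete, here is a small
userspace sketch of how a source vector could sit alongside the NMI
delivery mode in a 32-bit command value. It assumes the usual local APIC
ICR layout (vector in bits 7:0, delivery mode in bits 10:8, NMI encoded
as 100b). The MODEL_* names and helper functions are hypothetical and
are not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_APIC_DM_NMI	0x400u	/* delivery mode 100b in bits 10:8 */
#define MODEL_VECTOR_MASK	0xffu	/* vector field, bits 7:0 */

/* Combine the NMI delivery mode with a source vector. */
uint32_t model_icr_nmi(uint8_t source_vector)
{
	return MODEL_APIC_DM_NMI | source_vector;
}

/* Extract the vector field from a command value. */
uint8_t model_icr_vector(uint32_t icr)
{
	return (uint8_t)(icr & MODEL_VECTOR_MASK);
}

/* Extract the 3-bit delivery mode field from a command value. */
unsigned int model_icr_delivery_mode(uint32_t icr)
{
	return (icr >> 8) & 0x7u;
}

/* Usage example: vector 1 travels in the low byte next to the NMI mode. */
void model_icr_example(void)
{
	uint32_t icr = model_icr_nmi(1);

	assert(model_icr_vector(icr) == 1);
	assert(model_icr_delivery_mode(icr) == 4);
}
```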
Patch structure
===============
The patches are based on tip:x86/nmi because they depend on the NMI
cleanup series merged earlier [2].
Patch 1-2: Prepare FRED/KVM and enumerate NMI-source reporting
Patch 3-5: Register and handle NMI-source vectors
Patch 6-8: APIC changes to generate NMIs with vectors
Patch 9: Improve trace and debug with NMI-source information
Many thanks to Sean Christopherson, Xin Li, H. Peter Anvin, Andi Kleen,
Tony Luck, Kan Liang, Jacob Pan Jun, Zeng Guang and others for their
contributions, reviews and feedback.
Future work / Opens
===================
I am considering a few additional changes that would be valuable for
enhancing NMI handling support. Any feedback, preferences or suggestions
on the following would be helpful.
Assigning more NMI-source vectors
---------------------------------
The current patches assign NMI vectors to a limited number of sources.
The microcode rendezvous and crash reboot code use NMI but do not go
through the typical register_nmi_handler() path. Their handling is
special-cased in exc_nmi(). To isolate blame and improve debugging, it
would be useful to assign vectors to them, even if the vectors are
ignored during handling.
Other NMI sources, such as GHES and Platform NMIs, can also be assigned
vectors to speed up their NMI handling and improve isolation. However,
this would require a software/hardware agreement on vector reservation
and usage. Such an endeavor is likely not worth the effort.
Explicitly enabling NMIs
------------------------
HPA brought up the idea [3] of explicitly enabling NMIs only when the
kernel is ready to take them. With FRED, if we enter the kernel with
NMIs disabled, they could remain disabled until returning to
userspace.
DebugFS support
---------------
Currently, the kernel has counters for unknown NMIs, swallowed NMIs and
other NMI handling data. However, there is no easy way to access them.
To identify issues that happen over a longer timeframe, it might be
useful to add DebugFS support for NMI statistics.
KVM support
-----------
The NMI-source feature can be useful for perf users and other NMI use
cases in guest VMs. Exposing NMI-source to guests once FRED support is
in place should be relatively easy. The prototype code for this is
under evaluation.
Links
=====
[1]: Chapter 9, https://www.intel.com/content/www/us/en/content-details/819481/flexible-return-and-event-delivery-fred-specification.html
[2]: https://lore.kernel.org/lkml/20250327234629.3953536-1-sohil.mehta@intel.com/
[3]: https://lore.kernel.org/lkml/F5D36889-A868-46D2-A678-8EE26E28556D@zytor.com/
Jacob Pan (1):
perf/x86: Enable NMI-source reporting for perfmon
Sohil Mehta (7):
x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
x86/nmi: Extend the registration interface to include the NMI-source
vector
x86/nmi: Assign and register NMI-source vectors
x86/nmi: Add support to handle NMIs with source information
x86/nmi: Prepare for the new NMI-source vector encoding
x86/nmi: Enable NMI-source for IPIs delivered as NMIs
x86/nmi: Include NMI-source information in tracepoint and debug prints
Zeng Guang (1):
x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM
arch/x86/entry/entry_64_fred.S | 2 +-
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 6 ++--
arch/x86/events/intel/core.c | 6 ++--
arch/x86/include/asm/apic.h | 38 ++++++++++++++++++++++
arch/x86/include/asm/apicdef.h | 2 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/fred.h | 9 +++---
arch/x86/include/asm/nmi.h | 37 ++++++++++++++++++++-
arch/x86/kernel/apic/hw_nmi.c | 5 ++-
arch/x86/kernel/apic/ipi.c | 4 +--
arch/x86/kernel/apic/local.h | 24 +++++++-------
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/mce/inject.c | 4 +--
arch/x86/kernel/cpu/mshyperv.c | 3 +-
arch/x86/kernel/kgdb.c | 8 ++---
arch/x86/kernel/kvm.c | 9 +-----
arch/x86/kernel/nmi.c | 50 ++++++++++++++++++++++++++++-
arch/x86/kernel/nmi_selftest.c | 9 +++---
arch/x86/kernel/smp.c | 6 ++--
arch/x86/kvm/vmx/vmx.c | 5 +--
arch/x86/platform/uv/uv_nmi.c | 4 +--
drivers/acpi/apei/ghes.c | 2 +-
drivers/char/ipmi/ipmi_watchdog.c | 3 +-
drivers/edac/igen6_edac.c | 3 +-
drivers/thermal/intel/therm_throt.c | 2 +-
drivers/watchdog/hpwdt.c | 6 ++--
include/trace/events/nmi.h | 13 +++++---
28 files changed, 190 insertions(+), 74 deletions(-)
base-commit: f2e01dcf6df2d12e86c363ea9c37d53994d89dd6
--
2.43.0