Message-Id: <20211207195204.1582-1-eric.devolder@oracle.com>
Date: Tue, 7 Dec 2021 14:51:58 -0500
From: Eric DeVolder <eric.devolder@...cle.com>
To: linux-kernel@...r.kernel.org, x86@...nel.org,
kexec@...ts.infradead.org, ebiederm@...ssion.com,
dyoung@...hat.com, bhe@...hat.com, vgoyal@...hat.com
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com,
nramas@...ux.microsoft.com, thomas.lendacky@....com,
robh@...nel.org, efault@....de, rppt@...nel.org,
konrad.wilk@...cle.com, boris.ostrovsky@...cle.com,
eric.devolder@...cle.com
Subject: [RFC v2 0/6] crash: Kernel handling of CPU and memory hot un/plug
When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr (for x86), which describes the CPUs
and memory in the system, must also be updated, else the resulting
vmcore is inaccurate (e.g. missing either CPU context or memory
regions).
The current solution utilizes udev to initiate an unload-then-reload
of the kdump image (i.e. kernel, initrd, boot_params, purgatory and
elfcorehdr) by the userspace kexec utility. In previous posts I have
outlined the significant performance problems related to offloading
this activity to userspace.
This patchset introduces a generic crash hot un/plug handler that
registers with the CPU and memory notifiers. Upon CPU or memory
changes, this generic handler is invoked and performs important
housekeeping, for example obtaining the appropriate lock, and then
invokes an architecture specific handler to do the appropriate
updates.
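
The flow described above can be sketched in plain userspace C. This is
only an illustration of the dispatch pattern (notifier callback, take
the lock, call the arch hook, drop the lock), not the code in this
patchset: every identifier below (crash_hotplug_event,
arch_crash_hotplug_update, image_trylock and so on) is hypothetical,
and a simple flag stands in for the real kernel lock.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; they are not the patchset's identifiers. */
enum hp_action { HP_ADD_CPU, HP_REMOVE_CPU, HP_ADD_MEMORY, HP_REMOVE_MEMORY };

struct crash_image {
    bool loaded;                     /* is a kdump image currently loaded? */
    unsigned int elfcorehdr_updates; /* how many times the arch hook ran */
};

static struct crash_image crash_image;
static bool image_lock_held;         /* stand-in for the real kernel lock */

static bool image_trylock(void)
{
    if (image_lock_held)
        return false;
    image_lock_held = true;
    return true;
}

static void image_unlock(void)
{
    assert(image_lock_held);
    image_lock_held = false;
}

/* Architecture specific part: here it only records that it was called. */
static void arch_crash_hotplug_update(struct crash_image *image,
                                      enum hp_action action)
{
    (void)action;
    image->elfcorehdr_updates++;
}

/*
 * Generic part: called for every CPU or memory hot un/plug event.
 * Returns 0 on success, -1 if the lock could not be taken.
 */
int crash_hotplug_event(enum hp_action action)
{
    if (!image_trylock())
        return -1;

    /* Nothing to update unless a crash image is loaded. */
    if (crash_image.loaded)
        arch_crash_hotplug_update(&crash_image, action);

    image_unlock();
    return 0;
}
```

In this sketch the generic handler owns the locking and the "is an
image loaded" check, so an architecture only has to supply the update
hook.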
In the case of x86_64, the arch specific handler generates a new
elfcorehdr and overwrites the old one in memory. No involvement
with userspace is needed.
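
To illustrate what "generate a new elfcorehdr and overwrite the old
one" can look like, here is a minimal userspace sketch built on the
standard <elf.h> structures. It is not the patchset's code. The
function name build_elfcorehdr, the struct mem_range type, and the
idea of a fixed-size buffer that is rewritten in place are assumptions
made for the example, and the PT_NOTE entries are left as empty
placeholders instead of pointing at real per-CPU crash notes.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one physical memory range. */
struct mem_range {
    uint64_t start;
    uint64_t size;
};

/*
 * Rebuild an ELF core header in place: one PT_NOTE program header per
 * CPU and one PT_LOAD per memory range. Returns the number of bytes
 * used, or 0 (buffer untouched) if the new header would not fit.
 */
size_t build_elfcorehdr(void *buf, size_t buf_size, unsigned int ncpus,
                        const struct mem_range *ranges, unsigned int nranges)
{
    uint64_t nphdr = (uint64_t)ncpus + nranges;
    unsigned char *out = buf;
    size_t needed;
    Elf64_Ehdr ehdr;
    Elf64_Phdr phdr;
    unsigned int i;

    assert(buf != NULL);
    assert(ranges != NULL || nranges == 0);

    if (nphdr >= PN_XNUM)            /* e_phnum is only 16 bits wide */
        return 0;
    needed = sizeof(ehdr) + (size_t)nphdr * sizeof(phdr);
    if (needed > buf_size)
        return 0;

    memset(out, 0, buf_size);        /* drop every stale entry */

    memset(&ehdr, 0, sizeof(ehdr));
    memcpy(ehdr.e_ident, ELFMAG, SELFMAG);
    ehdr.e_ident[EI_CLASS] = ELFCLASS64;
    ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
    ehdr.e_ident[EI_VERSION] = EV_CURRENT;
    ehdr.e_type = ET_CORE;
    ehdr.e_machine = EM_X86_64;
    ehdr.e_version = EV_CURRENT;
    ehdr.e_phoff = sizeof(ehdr);
    ehdr.e_ehsize = sizeof(ehdr);
    ehdr.e_phentsize = sizeof(phdr);
    ehdr.e_phnum = (Elf64_Half)nphdr;
    memcpy(out, &ehdr, sizeof(ehdr));
    out += sizeof(ehdr);

    for (i = 0; i < ncpus; i++) {    /* placeholder note entry per CPU */
        memset(&phdr, 0, sizeof(phdr));
        phdr.p_type = PT_NOTE;
        memcpy(out, &phdr, sizeof(phdr));
        out += sizeof(phdr);
    }

    for (i = 0; i < nranges; i++) {  /* one loadable segment per range */
        memset(&phdr, 0, sizeof(phdr));
        phdr.p_type = PT_LOAD;
        phdr.p_flags = PF_R | PF_W | PF_X;
        phdr.p_offset = ranges[i].start;
        phdr.p_paddr = ranges[i].start;
        phdr.p_filesz = ranges[i].size;
        phdr.p_memsz = ranges[i].size;
        memcpy(out, &phdr, sizeof(phdr));
        out += sizeof(phdr);
    }

    return needed;
}
```

Calling this again after a CPU or memory range is added or removed
rewrites the same buffer, which is the property the sketch is meant to
show: the header can be refreshed without reloading anything else.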
To test this patchset and realize its benefits, one must make a couple
of minor changes to userspace:
 - Disable the udev rule for updating kdump on hot un/plug changes
   E.g. on RHEL: rm -f /usr/lib/udev/rules.d/98-kexec.rules
   or use another technique to neuter the rule.
 - Switch to the kexec_file_load syscall for loading the kdump kernel:
   E.g. on RHEL: in /usr/bin/kdumpctl, change to:
    standard_kexec_args="-p -d -s"
   which adds the -s to select the kexec_file_load syscall.
This patchset also supports kexec_load, given a modified kexec
userspace utility, which I am currently working on and will provide
separately.
Regards,
eric
---
RFC v2: 7dec2021
- Acting upon Baoquan He's suggestion to remove elfcorehdr from
the purgatory list of segments, removed the purgatory code from
the patchset; it is significantly simpler now.
RFC v1: 18nov2021
https://lkml.org/lkml/2021/11/18/845
- working patchset demonstrating kernel handling of hotplug
updates to x86 elfcorehdr for kexec_file_load
RFC: 14dec2020
https://lkml.org/lkml/2020/12/14/532
- proposed concept of allowing kernel to handle hotplug update
of elfcorehdr
---
Eric DeVolder (6):
crash: fix minor typo/bug in debug message
crash hp: Introduce CRASH_HOTPLUG configuration options
crash hp: definitions and prototype changes
crash hp: generic crash hotplug support infrastructure
crash hp: kexec_file changes for crash hotplug support
crash hp: Add x86 crash hotplug support
arch/x86/Kconfig | 26 ++++++++
arch/x86/kernel/crash.c | 140 +++++++++++++++++++++++++++++++++++++++-
include/linux/kexec.h | 21 +++++-
kernel/crash_core.c | 118 +++++++++++++++++++++++++++++++++
kernel/kexec_file.c | 15 ++++-
5 files changed, 314 insertions(+), 6 deletions(-)
--
2.27.0