Message-ID: <ZFiE/TXDtrt/y73w@MiWiFi-R3L-srv>
Date: Mon, 8 May 2023 13:13:33 +0800
From: Baoquan He <bhe@...hat.com>
To: Eric DeVolder <eric.devolder@...cle.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
kexec@...ts.infradead.org, ebiederm@...ssion.com,
dyoung@...hat.com, vgoyal@...hat.com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, nramas@...ux.microsoft.com, thomas.lendacky@....com,
robh@...nel.org, efault@....de, rppt@...nel.org, david@...hat.com,
sourabhjain@...ux.ibm.com, konrad.wilk@...cle.com,
boris.ostrovsky@...cle.com
Subject: Re: [PATCH v22 6/8] crash: hotplug support for kexec_load()
On 05/03/23 at 06:41pm, Eric DeVolder wrote:
> The hotplug support for kexec_load() requires coordination with
> userspace, and therefore a little extra help from the kernel to
> facilitate the coordination.
>
> Without this patch, if a kdump capture kernel is loaded via the
> kexec_load() syscall, the crash hotplug logic would find the segment
> containing the elfcorehdr and, upon a hotplug event, rewrite the
> elfcorehdr. While that is generally the desired behavior, a problem
> arises if the kdump image includes a purgatory that performs a digest
> check: the check would fail (because the elfcorehdr was changed), the
> capture kernel would fail to boot, and no kdump would occur.
>
> Therefore, what is needed is for the userspace kexec-tools to
> indicate to the kernel whether or not the supplied kdump image/
> elfcorehdr can be modified (because the kexec-tools excludes the
> elfcorehdr from the digest, and sizes the elfcorehdr memory buffer
> appropriately).
>
> To solve these problems, this patch introduces:
> - a new kexec flag KEXEC_UPDATE_ELFCOREHDR to indicate that it is
> safe for the kernel to modify the elfcorehdr (because kexec-tools
> has excluded the elfcorehdr from the digest).
> - the /sys/kernel/crash_elfcorehdr_size node to communicate to
> kexec-tools what the preferred size of the elfcorehdr memory buffer
> should be in order to accommodate hotplug changes.
> - the sysfs crash_hotplug nodes (i.e.
> /sys/devices/system/[cpu|memory]/crash_hotplug) are now dynamic in
> that they examine kexec_file_load() vs kexec_load(), and when
> kexec_load(), whether or not KEXEC_UPDATE_ELFCOREHDR is in effect.
> This is critical so that the udev rule processing of crash_hotplug
> indicates correctly whether the userspace unload-then-load of the
> kdump image can be skipped.
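
For readers following along, the node behavior described above can be modeled roughly as follows. This is only a sketch of the decision as the description states it, not the kernel code: the function names are hypothetical, the flag value is a placeholder for illustration, and it assumes a kdump image is loaded and crash hotplug support is built in.

```python
from pathlib import Path

# Placeholder bit value for illustration only; the authoritative
# definition is in the kernel's uapi kexec header.
KEXEC_UPDATE_ELFCOREHDR = 0x4


def crash_hotplug_value(loaded_via_file_load: bool, kexec_flags: int = 0) -> int:
    """Model what a crash_hotplug node reports.

    1 means the kernel updates the elfcorehdr itself on hotplug events;
    0 means userspace must unload-then-load the kdump image.
    """
    if loaded_via_file_load:
        # kexec_file_load(): the kernel built the image, so it may modify it.
        return 1
    # kexec_load(): only safe if kexec-tools set the flag.
    return 1 if kexec_flags & KEXEC_UPDATE_ELFCOREHDR else 0


def must_reload_kdump(node: Path) -> bool:
    """Userspace view: reload unless the node reads 1.

    A missing or unreadable node is treated as "reload" (hypothetical
    policy, chosen here to mirror the historical behavior).
    """
    try:
        return node.read_text().strip() != "1"
    except OSError:
        return True
```

A udev helper could call must_reload_kdump(Path("/sys/devices/system/cpu/crash_hotplug")) and skip the reload when it returns False.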
>
> With this patch in place, I believe the following statements to be true
> (with local testing to verify):
>
> - For systems which have these kernel changes in place, but not the
> corresponding changes to the crash hotplug udev rules and
> kexec-tools (i.e. "older" systems), those systems will continue to
> unload-then-load the kdump image, as has always been done. The
> kexec-tools will not set KEXEC_UPDATE_ELFCOREHDR.
> - For systems which have these kernel changes in place and the proposed
> udev rule changes in place, but not the kexec-tools changes in place:
> - the use of kexec_load() will not set KEXEC_UPDATE_ELFCOREHDR and
> so the unload-then-reload of kdump image will occur (the sysfs
> crash_hotplug nodes will show 0).
> - the use of kexec_file_load() will permit sysfs crash_hotplug nodes
> to show 1, and the kernel will modify the elfcorehdr directly. And
> with the udev changes in place, the unload-then-load will not occur!
> - For systems which have these kernel changes as well as the udev and
> kexec-tools changes in place, the user/admin has full control over
> whether crash hotplug support is enabled, whether via
> kexec_file_load() or kexec_load().
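
The rollout cases listed above can be summarized as a small truth table. This is a sketch under stated assumptions, not something taken from the patch: the function name is hypothetical, old udev rules are assumed to always reload, and in the fully updated case the admin is assumed to have opted in so that kexec-tools sets KEXEC_UPDATE_ELFCOREHDR.

```python
from itertools import product

SYSCALLS = ("kexec_load", "kexec_file_load")


def reload_needed(udev_updated: bool, kexec_tools_updated: bool, syscall: str) -> bool:
    """Return True if userspace still does unload-then-load on a hotplug event."""
    if syscall not in SYSCALLS:
        raise ValueError(f"unknown syscall: {syscall}")
    if not udev_updated:
        # Old udev rules do not consult crash_hotplug; they always reload.
        return True
    if syscall == "kexec_file_load":
        # crash_hotplug shows 1; the kernel rewrites the elfcorehdr itself.
        return False
    # kexec_load(): crash_hotplug shows 1 only when updated kexec-tools
    # set KEXEC_UPDATE_ELFCOREHDR (assumes the admin opted in).
    return not kexec_tools_updated


if __name__ == "__main__":
    # Print every combination so the cases can be compared side by side.
    for udev, tools, syscall in product((False, True), (False, True), SYSCALLS):
        action = "reload" if reload_needed(udev, tools, syscall) else "skip reload"
        print(f"udev={udev!s:5} kexec-tools={tools!s:5} {syscall:16} -> {action}")
```

The combination of updated kexec-tools with old udev rules is not covered by the description; the sketch treats it as "reload" on the assumption that the old rules never skip it.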
>
> Said differently, as kexec_load() was/is widely in use, these changes
> permit it to continue to be used as-is (retaining the current unload-then-
> reload behavior) until such time as the udev and kexec-tools changes can
> be rolled out as well.
>
> I've intentionally kept the changes related to userspace coordination
> for kexec_load() separate as this need was identified late; the
> rest of this series has been generally reviewed and accepted. Once
> this support has been vetted, I can refactor if needed.
>
> Suggested-by: Hari Bathini <hbathini@...ux.ibm.com>
> Signed-off-by: Eric DeVolder <eric.devolder@...cle.com>
LGTM,
Acked-by: Baoquan He <bhe@...hat.com>