Message-ID: <ZtkkIoUIu8shp/ut@MiWiFi-R3L-srv>
Date: Thu, 5 Sep 2024 11:23:14 +0800
From: Baoquan He <bhe@...hat.com>
To: Sourabh Jain <sourabhjain@...ux.ibm.com>
Cc: Michael Ellerman <mpe@...erman.id.au>,
Hari Bathini <hbathini@...ux.ibm.com>, kexec@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
x86@...nel.org, Sachin P Bappalige <sachinpb@...ux.vnet.ibm.com>
Subject: Re: [PATCH] kexec/crash: no crash update when kexec in progress
On 09/04/24 at 02:55pm, Sourabh Jain wrote:
> Hello Baoquan,
>
> On 30/08/24 16:47, Baoquan He wrote:
> > On 08/20/24 at 12:10pm, Sourabh Jain wrote:
> > > Hello Baoquan,
> > >
......snip...
> > > 2. A patch to return early from the `crash_handle_hotplug_event()` function
> > > if `kexec_in_progress` is set to True. This is essentially my original patch.
> > There's a race gap between checking kexec_in_progress and
> > setting it to true, which Michael has mentioned.
>
> That is the window where the kernel holds kexec_lock to do the kexec boot
> but kexec_in_progress is not yet set to True.
>
> If the kernel needs to handle a crash hotplug event, the function
> crash_handle_hotplug_event() will not get the kexec_lock and
> will error out, printing an error message about not being able to
> update the kdump image.
But you wanted to avoid erroring out while the kernel is in
kernel_kexec(). Now you are still seeing at least one noisy
message, aren't you?
>
> I think that should be fine, given that the lock is already taken for
> the kexec kernel boot.
>
> Am I missing something major?
>
> > That's why I think
> > maybe checking kexec_in_progress after failing to retrieve
> > __kexec_lock is a little better, though I'm not very sure.
>
> Trying for the kexec lock before the kexec_in_progress check will not solve
> the original problem this patch is trying to solve.
>
> You proposed the below changes earlier:
>
> - if (!kexec_trylock()) {
> + if (!kexec_trylock() && kexec_in_progress) {
> pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> crash_hotplug_unlock();
Ah, I meant the change below, but wrote it incorrectly.
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 63cf89393c6e..e7c7aa761f46 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -504,7 +504,7 @@ int crash_check_hotplug_support(void)
 	crash_hotplug_lock();
 	/* Obtain lock while reading crash information */
-	if (!kexec_trylock()) {
+	if (!kexec_trylock() && !kexec_in_progress) {
 		pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
 		crash_hotplug_unlock();
 		return 0;
>
>
> Once kexec_in_progress is set to True, there is no way one can get
> kexec_lock. So calling kexec_trylock() before checking kexec_in_progress
> is not helpful for the problem I am trying to solve.
With your patch, you could still get the error message if the race gap
exists. With the above change, you won't get it. Please correct me if I am
wrong.
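
To make the comparison concrete, the two orderings can be modelled in
user space. This is only a rough sketch, not kernel code: struct model,
model_trylock(), warns_flag_first(), warns_lock_first() and
check_gap_cases() are hypothetical names, and the set_flag_between
parameter is an assumption that stands in for kernel_kexec() setting
kexec_in_progress between the two checks made by the hotplug handler.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stand-ins for kernel state; these are NOT the kernel's symbols. */
struct model {
	bool lock_held;          /* models __kexec_lock held by kernel_kexec() */
	bool kexec_in_progress;  /* models the kernel flag of the same name */
};

/* Models kexec_trylock(): succeeds only if nobody holds the lock. */
static bool model_trylock(struct model *m)
{
	if (m->lock_held)
		return false;
	m->lock_held = true;
	return true;
}

/*
 * Ordering 1: check the flag first, then try the lock.
 * Returns true if the "kexec_trylock() failed" message would be printed.
 */
static bool warns_flag_first(struct model m, bool set_flag_between)
{
	if (m.kexec_in_progress)
		return false;                   /* silent early return */
	if (set_flag_between)
		m.kexec_in_progress = true;     /* flag changes after it was read */
	return !model_trylock(&m);
}

/*
 * Ordering 2: try the lock first, then look at the flag only on failure.
 * Returns true if the message would be printed.
 */
static bool warns_lock_first(struct model m, bool set_flag_between)
{
	bool got_lock = model_trylock(&m);

	if (set_flag_between)
		m.kexec_in_progress = true;     /* flag changes before it is read */
	return !got_lock && !m.kexec_in_progress;
}

/* Walks the "lock held, flag not yet set" gap for both orderings. */
static void check_gap_cases(void)
{
	struct model gap = { .lock_held = true, .kexec_in_progress = false };

	/* Flag stays false during the handler: both orderings print. */
	assert(warns_flag_first(gap, false));
	assert(warns_lock_first(gap, false));

	/* Flag is set between the two checks: only flag-first prints. */
	assert(warns_flag_first(gap, true));
	assert(!warns_lock_first(gap, true));
}
```

In this model the two orderings behave the same while the flag stays
false for the whole handler. They differ when the flag is set between
the two checks, where the lock-first ordering stays silent.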