Message-ID: <20111027174053.GK7491@redhat.com>
Date: Thu, 27 Oct 2011 13:40:53 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: Michael Holzheu <holzheu@...ux.vnet.ibm.com>
Cc: akpm@...ux-foundation.org,
"Eric W. Biederman" <ebiederm@...ssion.com>,
schwidefsky@...ibm.com, heiko.carstens@...ibm.com,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] kdump: Fix crash_kexec - smp_send_stop race in panic

On Wed, Oct 26, 2011 at 04:34:09PM +0200, Michael Holzheu wrote:
> Hello Andrew,
>
> After the discussion with Eric and Vivek the following patch
> seems to be a good solution to me. Could you accept this patch?
>
> When two CPUs call panic() at the same time, there is a
> possible race condition that can stop kdump. The first
> CPU calls crash_kexec() and the second CPU calls
> smp_send_stop() in panic() before crash_kexec() has finished
> on the first CPU. The second CPU thereby stops the first CPU,
> and kdump fails:
>
> 1st CPU:
> panic()->crash_kexec()->mutex_trylock(&kexec_mutex)-> do kdump
>
> 2nd CPU:
> panic()->crash_kexec()->kexec_mutex already held by 1st CPU
> ->smp_send_stop()-> stop 1st CPU (stop kdump)
>
> This patch fixes the problem by introducing a spinlock in
> panic that allows only one CPU to process crash_kexec() and
> the subsequent panic code.
>
> Signed-off-by: Michael Holzheu <holzheu@...ux.vnet.ibm.com>

Sounds reasonable to me.

Acked-by: Vivek Goyal <vgoyal@...hat.com>

Thanks
Vivek

> ---
> kernel/panic.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -59,6 +59,7 @@ EXPORT_SYMBOL(panic_blink);
> */
> NORET_TYPE void panic(const char * fmt, ...)
> {
> + static DEFINE_SPINLOCK(panic_lock);
> static char buf[1024];
> va_list args;
> long i, i_next = 0;
> @@ -82,6 +83,13 @@ NORET_TYPE void panic(const char * fmt,
> #endif
>
> /*
> + * Only one CPU is allowed to execute the panic code from here. For
> + * multiple parallel invocations of panic all other CPUs will wait on
> + * the panic_lock. They are stopped afterwards by smp_send_stop().
> + */
> + spin_lock(&panic_lock);
> +
> + /*
> * If we have crashed and we have a crash kernel loaded let it handle
> * everything else.
> * Do we want to call this before we try to display a message?
>
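
For readers who want to see the effect of the lock outside the kernel, here is a rough user-space sketch of the idea in the patch description. It is not the kernel code: threads stand in for CPUs, a C11 atomic_flag stands in for panic_lock, and the names fake_panic(), simulate_parallel_panics(), cpus_stopped and kdump_runs are hypothetical, made up for this illustration. The winner takes the lock and never releases it, so every later caller spins until the winner "stops" it, which is the role smp_send_stop() plays in panic().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 16

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_flag panic_lock = ATOMIC_FLAG_INIT; /* models the spinlock      */
static atomic_bool cpus_stopped;                  /* models smp_send_stop()   */
static atomic_int  kdump_runs;                    /* counts "crash_kexec" runs */

/* Models one CPU entering panic(). */
static void *fake_panic(void *arg)
{
	(void)arg;

	/*
	 * Like spin_lock(&panic_lock): the first caller gets the lock and
	 * never drops it. Everyone else spins here until the winner has
	 * "stopped" them.
	 */
	while (atomic_flag_test_and_set(&panic_lock)) {
		if (atomic_load(&cpus_stopped))
			return NULL;
	}

	atomic_fetch_add(&kdump_runs, 1);   /* stands in for crash_kexec()   */
	atomic_store(&cpus_stopped, true);  /* stands in for smp_send_stop() */
	return NULL;
}

/* Runs ncpus parallel "panics" and returns how many reached crash_kexec. */
int simulate_parallel_panics(int ncpus)
{
	pthread_t tid[MAX_CPUS];
	int i, started = 0;

	assert(ncpus > 0 && ncpus <= MAX_CPUS);

	/* Reset state so the simulation can be run more than once. */
	atomic_flag_clear(&panic_lock);
	atomic_store(&cpus_stopped, false);
	atomic_store(&kdump_runs, 0);

	for (i = 0; i < ncpus; i++) {
		if (pthread_create(&tid[started], NULL, fake_panic, NULL) == 0)
			started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&kdump_runs);
}
```

Calling simulate_parallel_panics(4) should report that exactly one thread got as far as the crash_kexec stand-in, no matter how the threads interleave. Build with something like "cc -std=c11 -pthread" and add a small main() that calls it. Without the lock, a second thread could reach the "stop" step while the first is still in its dump, which is the race described above.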