linux-kernel - Re: [PATCH] x86: make oops_begin and oops

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081021144505.GA25750@hmsreliant.think-freely.org>
Date:	Tue, 21 Oct 2008 10:45:05 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Alexander van Heukelum <heukelum@...lshack.com>
Cc:	Ingo Molnar <mingo@...e.hu>, kexec@...ts.infradead.org,
	linux-kernel@...r.kernel.org, vgoyal@...hat.com, hbabu@...ibm.com,
	ebiederm@...ssion.com, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com, akpm@...ux-foundation.org,
	Alexander van Heukelum <heukelum@...tmail.fm>
Subject: Re: [PATCH] x86: make oops_begin and oops_end equal

On Mon, Oct 20, 2008 at 05:08:34PM +0200, Alexander van Heukelum wrote:
> Mostly use the x86_64 version of oops_begin() and oops_end() on
> i386 too. Changes to the original x86_64 version:
> 
Hey, doing a sight review this am here.  Didn't find anything major, but I did
find a few little nits.  comments inlie

>  - move add_taint(TAINT_DIE) into oops_end()
>  - add a conditional crash_kexec() into oops_end()
> 
> Signed-off-by: Alexander van Heukelum <heukelum@...tmail.fm>
> ---
>  arch/x86/kernel/dumpstack_32.c |   36 +++++++++++++++++++++---------------
>  arch/x86/kernel/dumpstack_64.c |    4 +++-
>  2 files changed, 24 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
> index b361475..e45952b 100644
> --- a/arch/x86/kernel/dumpstack_32.c
> +++ b/arch/x86/kernel/dumpstack_32.c
> @@ -289,35 +289,41 @@ static unsigned int die_nest_count;
>  
>  unsigned __kprobes long oops_begin(void)
>  {
> +	int cpu;
>  	unsigned long flags;
>  
>  	oops_enter();
>  
> -	if (die_owner != raw_smp_processor_id()) {
> -		console_verbose();
> -		raw_local_irq_save(flags);
> -		__raw_spin_lock(&die_lock);
> -		die_owner = smp_processor_id();
> -		die_nest_count = 0;
> -		bust_spinlocks(1);
> -	} else {
> -		raw_local_irq_save(flags);
> +	/* racy, but better than risking deadlock. */
> +	raw_local_irq_save(flags);
> +	cpu = smp_processor_id();
> +	if (!__raw_spin_trylock(&die_lock)) {
> +		if (cpu == die_owner)
> +			/* nested oops. should stop eventually */;
> +		else
> +			__raw_spin_lock(&die_lock);
>  	}
>  	die_nest_count++;
> +	die_owner = cpu;
> +	console_verbose();
> +	bust_spinlocks(1);
>  	return flags;
>  }
>  
>  void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>  {
> -	bust_spinlocks(0);
>  	die_owner = -1;
> +	bust_spinlocks(0);
> +	die_nest_count--;
>  	add_taint(TAINT_DIE);
> -	__raw_spin_unlock(&die_lock);
> +	if (!die_nest_count)
> +		/* Nest count reaches zero, release the lock. */
> +		__raw_spin_unlock(&die_lock);
>  	raw_local_irq_restore(flags);
> -
> -	if (!regs)
> +	if (!regs) {
> +		oops_exit();
>  		return;
> -
> +	}
>  	if (kexec_should_crash(current))
>  		crash_kexec(regs);
Hmm.  I think this creates the same case that I just fixed in my initial post.
If we start using oops_end with this here, it may be possible to call
crash_kexec with the console_sem held.  If that happens, we deadlock.  I think
you should be able to move this clause up above the bust_spinlocks(0) without
any issue, and that would take care of that

>  	if (in_interrupt())
> @@ -405,6 +411,7 @@ die_nmi(char *str, struct pt_regs *regs, int do_panic)
>  		panic("Non maskable interrupt");
>  	console_silent();
>  	spin_unlock(&nmi_print_lock);
> +	bust_spinlocks(0);
>  
>  	/*
>  	 * If we are in kernel we are probably nested up pretty bad
> @@ -415,7 +422,6 @@ die_nmi(char *str, struct pt_regs *regs, int do_panic)
>  		crash_kexec(regs);
>  	}
>  
> -	bust_spinlocks(0);
>  	do_exit(SIGSEGV);
>  }
>  
This undoes my previous patch.  I realize your second patch fixes it properly so
the ordering is correct when oops_begin and oops_end are used, but if you could
rediff so this isn't here, I'd appreciate it.  If these patches are committed
separately, you'll avoid having the tree in a state where that deadlock can
reoccur (even if it is just for one commit)

> diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
> index 96a5db7..cd7b46b 100644
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -461,6 +461,7 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>  	die_owner = -1;
>  	bust_spinlocks(0);
>  	die_nest_count--;
> +	add_taint(TAINT_DIE);
>  	if (!die_nest_count)
>  		/* Nest count reaches zero, release the lock. */
>  		__raw_spin_unlock(&die_lock);
> @@ -469,6 +470,8 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>  		oops_exit();
>  		return;
>  	}
> +	if (kexec_should_crash(current))
> +		crash_kexec(regs);
If you're going to add the crash_kexec here (which looking at the call sites,
makes sense to me), you should likely remove it from the critical section of die
and die_nmi, just to avoid the redundancy.  Same issue as the 32 bit version
above applies, this needs to happen before you call bust_spinlocks(0).

Fix those issues, and the rest looks good to me.

Regards
Neil

-- 
/****************************************************
 * Neil Horman <nhorman@...driver.com>
 * Software Engineer, Red Hat
 ****************************************************/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/