Date:	Mon, 23 Feb 2009 10:07:35 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Vegard Nossum <vegard.nossum@...il.com>, stable@...nel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nick Piggin <npiggin@...e.de>,
	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)


* Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

> On Sat, Feb 21, 2009 at 07:00:30PM -0800, Paul E. McKenney wrote:
> > On Sat, Feb 21, 2009 at 07:37:20PM +0100, Vegard Nossum wrote:
> > > 2009/2/21 Vegard Nossum <vegard.nossum@...il.com>:
> 
> [ . . . ]
> 
> > > Okay, I don't really think it's an error. The if (user) test happens
> > > at the very beginning and gcc decides to reuse %edx. GDB doesn't know
> > > this, so it thinks the parameter changed, but at this point the
> > > parameter simply won't be used anymore.
> > > 
> > > So you're right: The value can't be trusted (after entry, anyway).
> > 
> > OK.  So at least the compiler is sane.  ;-)
> > 
> > And the fact that RCU Classic behaves the same as hierarchical RCU
> > pretty clearly points at some issue with the quiescent-state check code:
> > 
> > void rcu_check_callbacks(int cpu, int user)
> > {
> > 	if (user ||
> > 	    (idle_cpu(cpu) && !in_softirq() &&
> > 				hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> > 		rcu_qsctr_inc(cpu);
> > 		rcu_bh_qsctr_inc(cpu);
> > 	} else if (!in_softirq()) {
> > 		rcu_bh_qsctr_inc(cpu);
> > 	}
> > 	raise_softirq(RCU_SOFTIRQ);
> > }
> > 
> > In the case you traced earlier, we interrupted out of kernel code, yet
> > somehow arrived at rcu_qsctr_inc().  We know that "user" really was 0,
> > thanks to your careful analysis, so the issue must be in the other
> > clause.  Since we interrupted out of mainline kernel code, in_softirq()
> > should have returned 0, and hardirq_count() should also have met the
> > above condition.
> > 
> > You mentioned some concern about idle_cpu() separately, and if idle_cpu()
> > was returning 1, then RCU would most certainly decide that it was in a
> > quiescent state and that it could end the current grace period.
> 
> Hello, Vegard,
> 
> Could you please try out the following patch?  I am not 100% 
> confident of it on non-x86 architectures, nor during the time 
> that non-boot CPUs start up (though this patch should not 
> break non-boot CPUs any more than they might already be 
> broken).
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> The boot CPU runs in the context of its idle thread during 
> boot-up. During this time, idle_cpu(0) will always return 
> nonzero, which will fool Classic and Hierarchical RCU into 
> deciding that a large chunk of the boot-up sequence is a big 
> long quiescent state.  This in turn causes RCU to prematurely 
> end grace periods during this time.

Ah, that makes a lot of sense and explains it all! What a nasty 
little bug we had all along ...
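
As an illustration only (this is not part of the patch): a minimal 
userspace model of the first branch of the rcu_check_callbacks() test 
quoted above. The struct and function names are hypothetical stand-ins 
for idle_cpu(), in_softirq() and hardirq_count(). The nesting-depth 
field assumes that hardirq_count() <= (1 << HARDIRQ_SHIFT) means "at 
most one hardirq level, the tick itself".

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state; not kernel code. */
struct tick_ctx {
	bool user;         /* tick interrupted user mode */
	bool idle_cpu;     /* what idle_cpu(cpu) would return */
	bool in_softirq;   /* what in_softirq() would return */
	int  hardirq_nest; /* hardirq nesting depth, counting the tick itself */
};

/* Mirrors the first branch of the quoted rcu_check_callbacks(). */
static bool counts_as_quiescent(const struct tick_ctx *c)
{
	return c->user ||
	       (c->idle_cpu && !c->in_softirq && c->hardirq_nest <= 1);
}
```

With user == 0 and a tick taken out of mainline boot code on the boot 
CPU, the only input that can make this true is idle_cpu, which is 
exactly the case described in the changelog above.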

> This patch creates a new global variable that is set to 1 just 
> before the boot CPU first enters the scheduler, after which 
> the idle task really is idle.
> 
> Located-by: Vegard Nossum <vegard.nossum@...il.com>
> Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

Please also add kmemcheck to the changelog while at it ;-)

> ---
> 
>  init/main.c         |    3 +++
>  kernel/rcuclassic.c |    4 +++-
>  kernel/rcutree.c    |    4 +++-
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/init/main.c b/init/main.c
> index 8442094..51f4b71 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -121,6 +121,8 @@ static char *static_command_line;
>  static char *execute_command;
>  static char *ramdisk_execute_command;
>  
> +int idle_task_is_really_idle;	/* set to 1 late in boot. */
> +
>  #ifdef CONFIG_SMP
>  /* Setup configured maximum number of CPUs to activate */
>  unsigned int __initdata setup_max_cpus = NR_CPUS;
> @@ -463,6 +465,7 @@ static noinline void __init_refok rest_init(void)
>  	 * at least once to get things moving:
>  	 */
>  	init_idle_bootup_task(current);
> +	idle_task_is_really_idle = 1;
>  	preempt_enable_no_resched();
>  	schedule();
>  	preempt_disable();

Could you please use system_state instead? We could insert a new 
stage - or just use SYSTEM_RUNNING as the trigger.
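
A rough userspace sketch of that idea, as an assumption about how the 
check might look rather than a tested kernel change. The enum and 
variable below are hypothetical models of system_state; only 
SYSTEM_RUNNING is named in this thread, and the "booting" value is an 
assumed placeholder for everything before it.

```c
#include <stdbool.h>

/* Userspace model only; the real system_state has more values. */
enum model_system_state { MODEL_SYSTEM_BOOTING, MODEL_SYSTEM_RUNNING };

static enum model_system_state model_system_state = MODEL_SYSTEM_BOOTING;

/*
 * Same shape as the first branch of rcu_check_callbacks(), except that
 * an idle CPU is only trusted once the modelled state is "running".
 */
static bool idle_counts_as_quiescent(bool user, bool idle_cpu,
				     bool in_softirq, int hardirq_nest)
{
	bool idle_is_real = (model_system_state == MODEL_SYSTEM_RUNNING);

	return user ||
	       (idle_is_real && idle_cpu && !in_softirq && hardirq_nest <= 1);
}
```

Whether SYSTEM_RUNNING is set early enough to serve as the trigger, or 
a new stage is needed, is the open question here.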

	Ingo