linux-kernel - Re: [PATCH] fix cpu hotplug test failures on powerpc

Message-ID: <1260958592.17860.25.camel@laptop>
Date:	Wed, 16 Dec 2009 11:16:32 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Xiaotian Feng <dfeng@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	Rusty Russell <rusty@...tcorp.com.au>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: [PATCH] fix cpu hotplug test failures on powerpc

On Wed, 2009-12-16 at 17:15 +0800, Xiaotian Feng wrote:
> Sachin found cpu hotplug test failures on powerpc that made the kernel
> hang on his POWER box. The failure is discussed at
> http://marc.info/?l=linux-kernel&m=126052886204649&w=2
> 
> Commit 6ad4c18 (sched: Fix balance vs hotplug race) switched to
> cpu_active_mask, but in some situations the kernel can leave a
> cpu inactive but still online.
> 
> On some powerpc machines, hotplugging cpu0 is allowed. If cpu0 is the
> last online cpu and we try to offline it, cpu_down() marks cpu0
> inactive; then _cpu_down() finds that num_online_cpus() is 1 and
> returns -EBUSY, but cpu0 is never set back to active. So cpu0 ends up
> inactive but online.
> 
> The fix is to mark the cpu inactive only once we are actually going to
> bring it down, in _cpu_down().
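
The failure mode described above can be modelled in userspace. The following is only a sketch under stated assumptions, not kernel code: the arrays `online` and `active` stand in for cpu_online_mask and cpu_active_mask, and the `model_*` functions and `reset_single_cpu()` are hypothetical names that mirror only the ordering of the active-bit update relative to the -EBUSY check.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy stand-ins for cpu_online_mask and cpu_active_mask (assumption). */
static bool online[NR_CPUS];
static bool active[NR_CPUS];

static int num_online(void)
{
	int n = 0;

	for (int i = 0; i < NR_CPUS; i++)
		n += online[i];
	return n;
}

/*
 * Models the early checks in _cpu_down(). When clear_active_here is
 * true, the active bit is cleared only after the checks have passed,
 * which is the ordering the patch proposes.
 */
static int model__cpu_down(int cpu, bool clear_active_here)
{
	if (num_online() == 1)
		return -EBUSY;	/* refuse to offline the last online cpu */
	if (!online[cpu])
		return -EINVAL;

	if (clear_active_here)
		active[cpu] = false;
	online[cpu] = false;
	return 0;
}

/* Old ordering: the active bit is cleared before any check runs. */
static int model_cpu_down_old(int cpu)
{
	active[cpu] = false;
	return model__cpu_down(cpu, false);
}

/* Patched ordering: the active bit is left to model__cpu_down(). */
static int model_cpu_down_patched(int cpu)
{
	return model__cpu_down(cpu, true);
}

/* Put the model into the state from the report: only cpu0 is up. */
static void reset_single_cpu(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		online[i] = active[i] = false;
	online[0] = active[0] = true;
}
```

Calling `reset_single_cpu()` and then `model_cpu_down_old(0)` returns -EBUSY and leaves cpu0 online but inactive, which is the stuck state from the report. The same sequence with `model_cpu_down_patched(0)` also returns -EBUSY but leaves cpu0 both online and active.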

Good spotting, thanks! Some comments below.

> Reported-by: Sachin Sant <sachinp@...ibm.com>
> Signed-off-by: Xiaotian Feng <dfeng@...hat.com>
> Tested-by: Sachin Sant <sachinp@...ibm.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Rusty Russell <rusty@...tcorp.com.au>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: H. Peter Anvin <hpa@...or.com>
> Cc: Heiko Carstens <heiko.carstens@...ibm.com>
> ---
>  kernel/cpu.c |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 291ac58..a1e7165 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -209,6 +209,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
>  		return -ENOMEM;
>  
>  	cpu_hotplug_begin();
> +	set_cpu_active(cpu, false);
>  	err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod,
>  					hcpu, -1, &nr_calls);
>  	if (err == NOTIFY_BAD) {
> @@ -280,8 +281,6 @@ int __ref cpu_down(unsigned int cpu)
>  		goto out;
>  	}
>  
> -	set_cpu_active(cpu, false);
> -
>  	/*
>  	 * Make sure the all cpus did the reschedule and are not
>  	 * using stale version of the cpu_active_mask.

That renders the synchronize_sched() call down there useless, so might
as well remove it then.

> @@ -387,12 +386,6 @@ int disable_nonboot_cpus(void)
>  	 */
>  	cpumask_clear(frozen_cpus);
>  
> -	for_each_online_cpu(cpu) {
> -		if (cpu == first_cpu)
> -			continue;
> -		set_cpu_active(cpu, false);
> -	}
> -
>  	synchronize_sched();

And here too.

>  	printk("Disabling non-boot CPUs ...\n");


