lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Sat, 18 Apr 2015 21:03:12 -0700
From:	Guenter Roeck <linux@...ck-us.net>
To:	Rabin Vincent <rabin@....in>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async()
 locking)

On 04/18/2015 08:39 PM, Rabin Vincent wrote:
> On Sat, Apr 18, 2015 at 06:56:02PM -0700, Guenter Roeck wrote:
>> Further debugging (with added WARN_ON if cpu != 0 in smp_call_function_single) shows:
>>
>> [<800157ec>] (unwind_backtrace) from [<8001250c>] (show_stack+0x10/0x14)
>> [<8001250c>] (show_stack) from [<80494cb4>] (dump_stack+0x88/0x98)
>> [<80494cb4>] (dump_stack) from [<80024058>] (warn_slowpath_common+0x84/0xb4)
>> [<80024058>] (warn_slowpath_common) from [<80024124>] (warn_slowpath_null+0x1c/0x24)
>> [<80024124>] (warn_slowpath_null) from [<80078fc8>] (smp_call_function_single+0x170/0x178)
>> [<80078fc8>] (smp_call_function_single) from [<80090024>] (perf_event_exit_cpu+0x80/0xf0)
>> [<80090024>] (perf_event_exit_cpu) from [<80090110>] (perf_cpu_notify+0x30/0x48)
>> [<80090110>] (perf_cpu_notify) from [<8003d340>] (notifier_call_chain+0x44/0x84)
>> [<8003d340>] (notifier_call_chain) from [<8002451c>] (_cpu_up+0x120/0x168)
>> [<8002451c>] (_cpu_up) from [<800245d4>] (cpu_up+0x70/0x94)
>> [<800245d4>] (cpu_up) from [<80624234>] (smp_init+0xac/0xb0)
>> [<80624234>] (smp_init) from [<80618d84>] (kernel_init_freeable+0x118/0x268)
>> [<80618d84>] (kernel_init_freeable) from [<8049107c>] (kernel_init+0x8/0xe8)
>> [<8049107c>] (kernel_init) from [<8000f320>] (ret_from_fork+0x14/0x34)
>> ---[ end trace 2f9f1bb8a47b3a1b ]---
>> smp_call_function_single, cpu=1, wait=1, csd_stack=87825ea0
>> generic_exec_single, cpu=1, smp_processor_id()=0
>> csd_lock_wait: csd=87825ea0, flags=0x3
>>
>> This is repeated for each secondary CPU. But the secondary CPUs don't respond because
>> they are not enabled, which I guess explains why the lock is never released.
>>
>> So, in other words, this happens because the system believes (presumably per configuration
>> / fdt data) that there are four CPU cores, but that is not really the case. Previously that
>> did not matter, and was handled correctly. Now it is fatal.
>>
>> Does this help ?
>
> 8053871d0f7f67c7efb7f226ef031f78877d6625 moved the csd locking to the
> callers, but the offline check, which was earlier done before the csd
> locking, was not moved.  The following moves the checks to the earlier
> point fixes your boot:
>

Yes, your patch fixes the problem.

Tested-by: Guenter Roeck <linux@...ck-us.net>

Thanks,
Guenter

> diff --git a/kernel/smp.c b/kernel/smp.c
> index 2aaac2c..ba1fb01 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -159,9 +159,6 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>   	}
>
>
> -	if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))
> -		return -ENXIO;
> -
>   	csd->func = func;
>   	csd->info = info;
>
> @@ -289,6 +286,12 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>   	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>   		     && !oops_in_progress);
>
> +	if (cpu != smp_processor_id()
> +	    && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) {
> +		err = -ENXIO;
> +		goto out;
> +	}
> +
>   	csd = &csd_stack;
>   	if (!wait) {
>   		csd = this_cpu_ptr(&csd_data);
> @@ -300,6 +303,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>   	if (wait)
>   		csd_lock_wait(csd);
>
> +out:
>   	put_cpu();
>
>   	return err;
> @@ -328,6 +332,12 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)
>
>   	preempt_disable();
>
> +	if (cpu != smp_processor_id()
> +	    && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) {
> +		err = -ENXIO;
> +		goto out;
> +	}
> +
>   	/* We could deadlock if we have to wait here with interrupts disabled! */
>   	if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK))
>   		csd_lock_wait(csd);
> @@ -336,8 +346,9 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)
>   	smp_wmb();
>
>   	err = generic_exec_single(cpu, csd, csd->func, csd->info);
> -	preempt_enable();
>
> +out:
> +	preempt_enable();
>   	return err;
>   }
>   EXPORT_SYMBOL_GPL(smp_call_function_single_async);
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ