linux-kernel - Enabling CONFIG_HOTPLUG_CPU for CPU that does not have hardware support hot-plug


Message-ID: <BC373419EB4337418B2B595BAEDC155F01605CDD46A1@IL-MB01.marvell.com>
Date:	Tue, 22 Oct 2013 12:39:03 +0200
From:	Kosta Zertsekel <konszert@...vell.com>
To:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:	Eran Ben-Avi <benavi@...vell.com>,
	Nadav Haklai <nadavh@...vell.com>,
	Lior Amsalem <alior@...vell.com>
Subject: Enabling CONFIG_HOTPLUG_CPU for CPU that does not have hardware
 support hot-plug

Hi guys,	

The question
------------------
What are the possible drawbacks of enabling CONFIG_HOTPLUG_CPU for a CPU
that does not have hardware support for hot-plug?

The question is architecture-agnostic, but the behavior described below
was observed on an MPCore Cortex-A9 CPU with Linux 3.4.59.

The issue
-------------
When the Linux kernel is compiled in SMP mode with CONFIG_HOTPLUG_CPU not set
and is booted on a single-core CPU, warning messages of the form
"... task blocked for more than 120 seconds ..." start popping up in the dmesg log.

For example:
INFO: task ksoftirqd/1:9 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

To make the message disappear, CONFIG_HOTPLUG_CPU should be enabled.

Taking "ksoftirqd" as the example, the root cause is that one "ksoftirqd"
thread is created per present CPU, up to CONFIG_NR_CPUS
(see cpu_present_mask in kernel/cpu.c). Details on how the "ksoftirqd"
task is created and why it is not killed are given below.

The "ksoftirqd" task for a CPU that never comes up is *not* killed; it just
stays in the task queue until the kernel reports "... task blocked ...".

	Details:
	----------
	The first "ksoftirqd" task (for CPU[0]) is created when the registered
	early_initcall function spawn_ksoftirqd() is executed in the following
	flow:

	start_kernel() ---> rest_init() ---> kernel_init() --->
	do_pre_smp_initcalls() ---> spawn_ksoftirqd() --->
	cpu_callback(... CPU_ONLINE ...)

	The "ksoftirqd" tasks for CPU[1 .. N-1] are created in a different flow.
	First, spawn_ksoftirqd() registers cpu_callback() from kernel/softirq.c
	(the callback that creates the "ksoftirqd" task) as a CPU notifier.
	This callback is then called in the following flow:

	start_kernel() ---> rest_init() ---> kernel_init() --->
	smp_init() ---> for_each_present_cpu(cpu) { cpu_up(cpu) } --->
	_cpu_up() ---> __cpu_notify(CPU_UP_PREPARE)
	(the "ksoftirqd" task is created here)

	Right after that, the kernel attempts to bring up CPU[x] (using __cpu_up());
	if __cpu_up(cpu) fails, the "ksoftirqd" task is killed via
	__cpu_notify(CPU_UP_CANCELED) a few lines below.

	However, the code that actually kills the task (using kthread_stop()) is
	wrapped in #ifdef CONFIG_HOTPLUG_CPU (see cpu_callback() in kernel/softirq.c).
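
	To make the flow above concrete, here is a small userspace model of it.
	It is only a sketch under stated assumptions, not the kernel code: the
	names model_cpu_callback(), model_smp_init(), ksoftirqd_exists[] and
	NR_CPUS are hypothetical, the boot CPU is treated like any other CPU,
	and a runtime flag stands in for the compile-time
	#ifdef CONFIG_HOTPLUG_CPU guard so that both cases can be compared.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical stand-in for CONFIG_NR_CPUS */

enum cpu_event { CPU_UP_PREPARE, CPU_ONLINE, CPU_UP_CANCELED };

/* Toy state: true while a per-CPU "ksoftirqd" thread exists. */
static bool ksoftirqd_exists[NR_CPUS];

/*
 * Models cpu_callback() from kernel/softirq.c. The hotplug_enabled flag
 * plays the role of the #ifdef CONFIG_HOTPLUG_CPU guard (assumption:
 * runtime flag instead of a compile-time switch).
 */
static void model_cpu_callback(enum cpu_event ev, int cpu, bool hotplug_enabled)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	switch (ev) {
	case CPU_UP_PREPARE:
		ksoftirqd_exists[cpu] = true;	/* thread is created here */
		break;
	case CPU_ONLINE:
		break;				/* nothing to model */
	case CPU_UP_CANCELED:
		if (hotplug_enabled)		/* guarded cleanup path */
			ksoftirqd_exists[cpu] = false;
		break;
	}
}

/*
 * Models the smp_init() loop: every present CPU gets CPU_UP_PREPARE, but
 * only the first working_cpus actually come up. Returns the number of
 * threads left behind for CPUs that failed to come up.
 */
static int model_smp_init(int present_cpus, int working_cpus, bool hotplug_enabled)
{
	int cpu, stale = 0;

	assert(working_cpus >= 0 && working_cpus <= present_cpus);
	assert(present_cpus <= NR_CPUS);

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ksoftirqd_exists[cpu] = false;

	for (cpu = 0; cpu < present_cpus; cpu++) {
		model_cpu_callback(CPU_UP_PREPARE, cpu, hotplug_enabled);
		if (cpu < working_cpus)
			model_cpu_callback(CPU_ONLINE, cpu, hotplug_enabled);
		else	/* models __cpu_up() failing for this CPU */
			model_cpu_callback(CPU_UP_CANCELED, cpu, hotplug_enabled);
	}

	for (cpu = working_cpus; cpu < present_cpus; cpu++)
		if (ksoftirqd_exists[cpu])
			stale++;

	return stale;
}
```

	In this model, model_smp_init(4, 1, false) leaves three stale threads
	behind, while model_smp_init(4, 1, true) leaves none, which mirrors the
	difference observed with and without CONFIG_HOTPLUG_CPU.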

The solution
-----------------
The easy solution is to enable CONFIG_HOTPLUG_CPU, which enables
compilation of the code that kills the "ksoftirqd" task.
What are the possible drawbacks of enabling CONFIG_HOTPLUG_CPU for a CPU
that does not have hardware support for hot-plug?

Thanks,
--- KostaZ