Date:   Wed,  5 Sep 2018 12:15:17 -0700
From:   Prakruthi Deepak Heragu <pheragu@...eaurora.org>
To:     tglx@...utronix.de
Cc:     linux-kernel@...r.kernel.org, tsoni@...eaurora.org,
        ckadabi@...eaurora.org, bryanh@...eaurora.org,
        psodagud@...eaurora.org,
        Prakruthi Deepak Heragu <pheragu@...eaurora.org>
Subject: [PATCH] kernel: cpu: Handle hotplug failure for state CPUHP_AP_IDLE_DEAD

Once the teardown hotplug handler has run, the cpu is dead and enters
the CPUHP_AP_IDLE_DEAD state. Any callback that fails in the state
machine with state < CPUHP_AP_IDLE_DEAD must be treated as fatal,
because the failure can leave timers stranded on the dead cpu instead
of being migrated away. This leads to issues such as workqueue
lockups and the sched_clock timestamp wrapping to zero, since
sched_clock_poll(), which sits in the hrtimer base of the cpu being
hotplugged, does not get migrated.

The function sched_clock_poll() updates epoch_ns and epoch_cyc. If
this timer, queued in the hrtimer base of the cpu being hotplugged,
is not migrated, epoch_ns and epoch_cyc are never updated.
Subsequently, when sched_clock() is called, it uses the stale values
of epoch_ns and epoch_cyc, which makes the timestamp look as if the
timer wrapped around:
[ 8792.168842] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=6801s workers=2 manager: 4884
[ 8792.168862] pool 16: cpus=0-7 flags=0x4 nice=0 hung=0s workers=34 idle: 4482 1390 1394 1396 4492 5442 5447 5445
[    0.017714] Modules linked in: wlan(O)
[    0.017733] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W  O    4.9.37+ #1
[    0.017746] task: ffffffc1b05c8080 task.stack: ffffffc1b05c4000

As seen, the timestamp rolls over to 0 after 8792 seconds.
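
The effect of a stale epoch can be illustrated with a small user-space
model. This is only a sketch, not the kernel implementation: the
struct and the model_* function names are hypothetical, one counter
cycle is assumed to equal one nanosecond, and a 16-bit counter mask is
used so that the wrap happens quickly.

```c
#include <stdint.h>

/* Hypothetical, simplified model of the sched_clock epoch bookkeeping. */
struct clock_model {
	uint64_t epoch_ns;	/* time in ns at the last epoch update */
	uint64_t epoch_cyc;	/* counter value at the last epoch update */
	uint64_t mask;		/* width mask of the hardware counter */
};

/* Assumption for illustration: 1 cycle == 1 ns, so no mult/shift. */
static uint64_t model_sched_clock(const struct clock_model *c, uint64_t cyc)
{
	return c->epoch_ns + ((cyc - c->epoch_cyc) & c->mask);
}

/* Models the periodic poll: move the epoch forward to "now". */
static void model_sched_clock_poll(struct clock_model *c, uint64_t cyc)
{
	c->epoch_ns = model_sched_clock(c, cyc);
	c->epoch_cyc = cyc & c->mask;
}
```

In this model, with mask 0xFFFF and the epoch left at zero, a counter
value of 0x0010 read after the counter has wrapped yields 0x0010, so
time appears to restart. If model_sched_clock_poll() runs at 0xFFF0
before the wrap, the same read yields 0x10010 and time keeps advancing.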

Signed-off-by: Channagoud Kadabi <ckadabi@...eaurora.org>
Signed-off-by: Prakruthi Deepak Heragu <pheragu@...eaurora.org>
---
 kernel/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 0db8938..51fa38f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -837,6 +837,7 @@ static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
 
 	for (; st->state > target; st->state--) {
 		ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+		BUG_ON(ret && st->state < CPUHP_AP_IDLE_DEAD);
 		if (ret) {
 			st->target = prev_state;
 			undo_cpu_down(cpu, st);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
