Message-ID: <20140326035648.21736.85740.stgit@preeti.in.ibm.com>
Date:	Wed, 26 Mar 2014 09:26:48 +0530
From:	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
To:	tglx@...utronix.de, mingo@...hat.com, linux-kernel@...r.kernel.org
Cc:	linux-pm@...r.kernel.org, peterz@...radead.org, rjw@...ysocki.net,
	srivatsa.bhat@...ux.vnet.ibm.com, paulmck@...ux.vnet.ibm.com,
	davem@...emloft.net
Subject: [PATCH] tick, broadcast: Prevent false alarm when force mask contains offline cpus

It's possible for tick_broadcast_force_mask to contain cpus which are not
in cpu_online_mask when a broadcast tick occurs. This can happen under the
following circumstances, assuming CPU1 is among the CPUs waiting for
broadcast.

CPU0					CPU1

Run CPU_DOWN_PREPARE notifiers

Start stop_machine			Gets woken up by IPI to run
					stop_machine, sets itself in
					tick_broadcast_force_mask if the
					time of broadcast interrupt is around
					the same time as this IPI.

					Start stop_machine
					  set_cpu_online(cpu1, false)
End stop_machine			End stop_machine

Broadcast interrupt
  Finds that cpu1 in
  tick_broadcast_force_mask is offline
  and triggers the WARN_ON in
  tick_handle_oneshot_broadcast()

Clears all broadcast masks
in CPU_DEAD stage.

This WARN_ON was added to capture scenarios where a broadcast mask, be it
the oneshot, pending or force mask, contains offline cpus whose tick devices
have been removed. But here is a case where we trigger the WARN_ON in a valid
scenario.

One could argue that the scenario is invalid and ought to be warned against,
because ideally the cpus about to go offline should be cleared from the
broadcast masks before they are cleared in the online_mask, so that we don't
hit these scenarios.

This would mean clearing the masks in the CPU_DOWN_PREPARE stage. But it is
quite possible that this stage itself will fail and cpu hotplug will not go
through. We would then end up in a situation where the cpu has not gone
offline and continues to wait for the broadcast interrupt like before.
However, it has been cleared from the broadcast masks, so this interrupt will
never be delivered. Hence clearing of the masks is best put off until we are
sure that the cpu is dead, i.e. in the CPU_DEAD stage.
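
To illustrate the argument above, here is a minimal userspace sketch (not
kernel code) of what goes wrong if the masks were cleared in
CPU_DOWN_PREPARE and the hotplug operation then failed. The struct and the
function names are hypothetical and exist only for this illustration; plain
64-bit integers stand in for cpumasks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit N set means CPU N is in the mask. */
typedef uint64_t cpu_bits_t;

struct hotplug_model {
	cpu_bits_t online;	/* stand-in for cpu_online_mask */
	cpu_bits_t broadcast;	/* stand-in for the broadcast masks */
};

/* The rejected alternative: clear the cpu from the masks up front. */
static void early_clear_on_down_prepare(struct hotplug_model *m, int cpu)
{
	m->broadcast &= ~((cpu_bits_t)1 << cpu);
}

/* What the CPU_DEAD stage does in this model: the cpu is known dead. */
static void clear_on_cpu_dead(struct hotplug_model *m, int cpu)
{
	m->online &= ~((cpu_bits_t)1 << cpu);
	m->broadcast &= ~((cpu_bits_t)1 << cpu);
}

/*
 * True if the cpu is still online but no longer in the broadcast mask,
 * i.e. a cpu that was waiting for a broadcast would never be woken.
 */
static bool misses_broadcast(const struct hotplug_model *m, int cpu)
{
	cpu_bits_t bit = (cpu_bits_t)1 << cpu;

	return (m->online & bit) && !(m->broadcast & bit);
}
```

If CPU1 is online and waiting, calling early_clear_on_down_prepare() and
then abandoning the hotplug leaves misses_broadcast() true for CPU1, which
is the stranded state described above. Clearing only via
clear_on_cpu_dead() cannot produce that state in this model.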

Hence simply ensure that tick_broadcast_force_mask is a subset of the online
cpus to take care of rare occurrences such as the above. Moreover, this is not
the harmful scenario in which a cpu is in the mask but its tick device has
been shut down. The WARN_ON will then continue to capture cases where we could
possibly cause a kernel crash.
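
As a rough userspace model of the change (not kernel code), the masking
step can be sketched with plain integers standing in for cpumasks. The type
and function names below are hypothetical, and the sanity check is assumed
to be a subset test of the wake-up set against the online cpus.

```c
#include <stdint.h>

/* Userspace stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * Model of the patched step: drop offline cpus from the force mask, fold
 * the remainder into the set of cpus to wake, then clear the force mask.
 * Returns the updated wake-up set.
 */
static mask_t fold_force_mask(mask_t tmpmask, mask_t *force_mask,
			      mask_t online_mask)
{
	*force_mask &= online_mask;	/* models cpumask_and(force, force, online) */
	tmpmask |= *force_mask;		/* models cpumask_or(tmp, tmp, force) */
	*force_mask = 0;		/* models cpumask_clear(force) */
	return tmpmask;
}

/* Assumed shape of the sanity check: nonzero if an offline cpu would be woken. */
static int has_offline_cpu(mask_t tmpmask, mask_t online_mask)
{
	return (tmpmask & ~online_mask) != 0;
}
```

With CPU1 set in the force mask but only CPU0 online, fold_force_mask()
returns a wake-up set without CPU1, so has_offline_cpu() stays zero. An
offline cpu that reaches the wake-up set through any other mask is still
flagged.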

Signed-off-by: Preeti U Murthy <preeti@...ux.vnet.ibm.com>
---

 kernel/time/tick-broadcast.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 63c7b2d..30b8731 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -606,7 +606,12 @@ again:
 	 */
 	cpumask_clear_cpu(smp_processor_id(), tick_broadcast_pending_mask);

-	/* Take care of enforced broadcast requests */
+	/* Take care of enforced broadcast requests. We could have offline
+	 * cpus in the tick_broadcast_force_mask. That's ok, we got the
+	 * interrupt before we could clear the mask.
+	 */
+	cpumask_and(tick_broadcast_force_mask,
+			tick_broadcast_force_mask, cpu_online_mask);
 	cpumask_or(tmpmask, tmpmask, tick_broadcast_force_mask);
 	cpumask_clear(tick_broadcast_force_mask);
