lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 26 May 2014 16:38:38 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	peterz@...radead.org, tglx@...utronix.de, mingo@...nel.org,
	tj@...nel.org, rusty@...tcorp.com.au, akpm@...ux-foundation.org,
	fweisbec@...il.com, hch@...radead.org
Cc:	mgorman@...e.de, riel@...hat.com, bp@...e.de, rostedt@...dmis.org,
	mgalbraith@...e.de, ego@...ux.vnet.ibm.com,
	paulmck@...ux.vnet.ibm.com, oleg@...hat.com, rjw@...ysocki.net,
	linux-kernel@...r.kernel.org, srivatsa.bhat@...ux.vnet.ibm.com
Subject: [PATCH v7 2/2] CPU hotplug,
 smp: Flush any pending IPI callbacks before CPU offline

During CPU offline, in stop-machine, we don't enforce any rule in the
_DISABLE_IRQ stage, regarding the order in which the outgoing CPU and the other
CPUs disable their local interrupts. Hence, we can encounter a scenario as
depicted below, in which IPIs are sent by the other CPUs to the CPU going
offline (while it is *still* online), but the outgoing CPU notices them only
*after* it has gone offline.


              CPU 1                                         CPU 2
          (Online CPU)                               (CPU going offline)

       Enter _PREPARE stage                          Enter _PREPARE stage

                                                     Enter _DISABLE_IRQ stage


                                                   =
       Got a device interrupt,                     | Didn't notice the IPI
       and the interrupt handler                   | since interrupts were
       called smp_call_function()                  | disabled on this CPU.
       and sent an IPI to CPU 2.                   |
                                                   =


       Enter _DISABLE_IRQ stage


       Enter _RUN stage                              Enter _RUN stage

                                  =
       Busy loop with interrupts  |                  Invoke take_cpu_down()
       disabled.                  |                  and take CPU 2 offline
                                  =


       Enter _EXIT stage                             Enter _EXIT stage

       Re-enable interrupts                          Re-enable interrupts

                                                     The pending IPI is noted
                                                     immediately, but alas,
                                                     the CPU is offline at
                                                     this point.



This of course, makes the smp-call-function IPI handler code unhappy and it
complains about "receiving an IPI on an offline CPU".

However, if we look closely, we observe that the IPI was sent when CPU 2 was
still online, and hence it was perfectly legal for CPU 1 to send the IPI at
that point. Furthermore, receiving an IPI on an offline CPU is terrible only
if there were pending callbacks yet to be executed by that CPU (in other words,
its a bug if the CPU went offline with work still pending).

So, fix this by flushing all the queued smp-call-function callbacks on the
outgoing CPU in the CPU_DYING stage[1], including those callbacks for which the
source CPU's IPIs might not have been received on the outgoing CPU yet. This
ensures that all pending IPI callbacks are run before the CPU goes completely
offline. But note that the outgoing CPU can still get IPIs from the other CPUs
just after it exits stop-machine, due to the scenario mentioned above. But
because we flush the callbacks before going offline, this will be completely
harmless.

Further, this solution also guarantees that there will be pending callbacks
on an offline CPU *only if* the source CPU initiated the IPI-send-procedure
*after* the target CPU went offline, which clearly indicates a bug in the
sender code.

So, considering all this, teach the smp-call-function IPI handler code to
complain only if an offline CPU received an IPI *and* it still had pending
callbacks to execute, since that is the only buggy scenario.

There is another case (somewhat theoretical though) where IPIs might arrive
late on the target CPU (possibly _after_ the CPU has gone offline): due to IPI
latencies in the hardware. But with this patch, even this scenario turns out
to be harmless, since we explicitly loop through the call_single_queue and
flush out any pending callbacks without waiting for the corresponding IPIs
to arrive.


[1]. The CPU_DYING part needs a little more explanation: by the time we
execute the CPU_DYING notifier callbacks, the CPU would have already been
marked offline. But we want to flush out the pending callbacks at this stage,
ignoring the fact that the CPU is offline. So restructure the IPI handler
code so that we can by-pass the "is-cpu-offline?" check in this particular
case. (Of course, the right solution here is to fix CPU hotplug to mark the
CPU offline _after_ invoking the CPU_DYING notifiers, but this requires a
lot of audit to ensure that this change doesn't break any existing code;
hence lets go with the solution proposed above until that is done).

Suggested-by: Frederic Weisbecker <fweisbec@...il.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
---

 kernel/smp.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 306f818..5295388 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -29,6 +29,8 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_function_data, cfd_data);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue);
 
+static void flush_smp_call_function_queue(bool warn_cpu_offline);
+
 static int
 hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -52,6 +54,20 @@ hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
 
+	case CPU_DYING:
+	case CPU_DYING_FROZEN:
+		/*
+		 * The IPIs for the smp-call-function callbacks queued by other
+		 * CPUs might arrive late, either due to hardware latencies or
+		 * because this CPU disabled interrupts (inside stop-machine)
+		 * before the IPIs were sent. So flush out any pending callbacks
+		 * explicitly (without waiting for the IPIs to arrive), to
+		 * ensure that the outgoing CPU doesn't go offline with work
+		 * still pending.
+		 */
+		flush_smp_call_function_queue(false);
+		break;
+
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		free_cpumask_var(cfd->cpumask);
@@ -177,23 +193,47 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
 	return 0;
 }
 
-/*
- * Invoked by arch to handle an IPI for call function single. Must be
- * called from the arch with interrupts disabled.
+/**
+ * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks
+ *
+ * Invoked by arch to handle an IPI for call function single.
+ * Must be called with interrupts disabled.
  */
 void generic_smp_call_function_single_interrupt(void)
 {
+	flush_smp_call_function_queue(true);
+}
+
+/**
+ * flush_smp_call_function_queue - Flush pending smp-call-function callbacks
+ *
+ * @warn_cpu_offline: If set to 'true', warn if callbacks were queued on an
+ * 		      offline CPU. Skip this check if set to 'false'.
+ *
+ * Flush any pending smp-call-function callbacks queued on this CPU. This is
+ * invoked by the generic IPI handler, as well as by a CPU about to go offline,
+ * to ensure that all pending IPI functions are run before it goes completely
+ * offline.
+ *
+ * Loop through the call_single_queue and run all the queued functions.
+ * Must be called with interrupts disabled.
+ */
+static void flush_smp_call_function_queue(bool warn_cpu_offline)
+{
+	struct llist_head *head;
 	struct llist_node *entry;
 	struct call_single_data *csd, *csd_next;
 	static bool warned;
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	WARN_ON(!irqs_disabled());
+
+	head = &__get_cpu_var(call_single_queue);
+	entry = llist_del_all(head);
 	entry = llist_reverse_order(entry);
 
-	/*
-	 * Shouldn't receive this interrupt on a cpu that is not yet online.
-	 */
-	if (unlikely(!cpu_online(smp_processor_id()) && !warned)) {
+	/* There shouldn't be any pending callbacks on an offline CPU. */
+	if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) &&
+		     !warned && !llist_empty(head))) {
 		warned = true;
 		WARN(1, "IPI on offline CPU %d\n", smp_processor_id());
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ