Message-ID: <52A1286E.9000606@gmail.com>
Date: Fri, 06 Dec 2013 05:29:18 +0400
From: Max Filippov <jcmvbkbc@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: Chuansheng Liu <chuansheng.liu@...el.com>, mingo@...nel.org,
Peter Zijlstra <peterz@...radead.org>, jbeulich@...e.com,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>, mina86@...a86.org,
"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>, jun.zhang@...el.com,
Fengguang Wu <fengguang.wu@...el.com>,
Alex Nemirovsky <Alex.Nemirovsky@...tina-systems.com>,
Artemi Ivanov <artemi.ivanov@...entembedded.com>
Subject: Re: [PATCH V2] smp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq
Hi Thomas,
On Fri, Jul 5, 2013 at 6:37 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Fri, 5 Jul 2013, Thomas Gleixner wrote:
>> On Sat, 16 Feb 2013, Chuansheng Liu wrote:
>> > Currently the functions smp_call_function_many()/single() will
>> > give a WARN()ing only in the case of irqs_disabled(), but that
>> > check is not enough to guarantee execution of the SMP
>> > cross-calls.
>> >
>> > In many other cases such as softirq handling/interrupt handling,
>> > the two APIs still can not be called, just as the
>> > smp_call_function_many() comments say:
>> >
>> > * You must not call this function with disabled interrupts or from a
>> > * hardware interrupt handler or from a bottom half handler. Preemption
>> > * must be disabled when calling this function.
>> >
>> > There is a real case for softirq DEADLOCK case:
>> >
>> > CPUA                            CPUB
>> >                                 spin_lock(&spinlock)
>> >                                 Any irq coming, call the irq handler
>> >                                 irq_exit()
>> > spin_lock_irq(&spinlock)
>> >   <== blocking here, as CPUB
>> >       holds the lock
>> >                                 __do_softirq()
>> >                                 run_timer_softirq()
>> >                                 timer_cb()
>> >                                 call smp_call_function_many()
>> >                                 send IPI interrupt to CPUA
>> >                                 wait_csd()
>> >
>> > Then both CPUA and CPUB will be deadlocked here.
>>
>> That's not true if called with wait = 0, as we won't wait for the csd
>> in that case. The function will be invoked on CPUA after it re-enables
>> interrupts. So for callers that don't care about synchronous execution
>> it should not warn in softirq context.
>
> Hmm, even there it matters, because of the following scenario:
>
> CPU 0
> smp_call_function_single(CPU 1)
>   csd_lock(CPU 1)
>   irq_enter()
>   irq_exit()
>     __do_softirq()
>       smp_call_function_many()
>         setup csd (CPU 1)
>         csd_lock(CPU 1) ==> CPU 0 deadlocked itself.
>
> And this is even more likely to happen than the lock issue.
I've observed a similar deadlock in a real system, where a network
driver uses smp_call_function_single from softirq context.
The fix proposed below keeps IRQs disabled on the sending CPU
between the point where the csd is marked locked and the point
where the IPI is sent, which makes it safe to call
smp_call_function_single from softirq context. What do you think?
--->8---
From 5fa496ce12eaf994debab202cde618b9da7d9402 Mon Sep 17 00:00:00 2001
From: Max Filippov <jcmvbkbc@...il.com>
Date: Fri, 6 Dec 2013 04:50:03 +0400
Subject: [PATCH] smp: allow calling smp_call_function_single from softirq
This prevents the following deadlocks on the sending CPU by keeping
interrupts disabled between the point where the csd is locked and the
point where the IPI is sent to the peer CPU.
Case 1:

CPU 0
smp_call_function_single(CPU 1, wait = 0)
  csd_lock(CPU 0)
  irq_enter()
  irq_exit()
    __do_softirq()
      smp_call_function_single(CPU 1, wait = 0)
        csd_lock(CPU 0) => deadlock, as the csd will never be unlocked

Case 2:

CPU 0
smp_call_function_single(CPU 1, wait = 1)
  csd_lock(on stack)
  queue csd to CPU 1 call_single_queue
  irq_enter()
  irq_exit()
    __do_softirq()
      smp_call_function_single(CPU 1, wait = 1)
        setup csd (on stack)
        queue csd to CPU 1 call_single_queue
        csd_lock_wait => never returns, as the IPI was never sent to CPU 1
Signed-off-by: Max Filippov <jcmvbkbc@...il.com>
---
kernel/smp.c | 47 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 0564571..7bc9a01 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -122,6 +122,30 @@ static void csd_lock(struct call_single_data *csd)
 	smp_mb();
 }
 
+static unsigned long csd_lock_irqsave(struct call_single_data *csd)
+{
+	unsigned long flags;
+
+	for (;;) {
+		csd_lock_wait(csd);
+		local_irq_save(flags);
+		if (csd->flags & CSD_FLAG_LOCK)
+			local_irq_restore(flags);
+		else
+			break;
+	}
+	csd->flags |= CSD_FLAG_LOCK;
+
+	/*
+	 * prevent CPU from reordering the above assignment
+	 * to ->flags with any subsequent assignments to other
+	 * fields of the specified call_single_data structure:
+	 */
+	smp_mb();
+
+	return flags;
+}
+
 static void csd_unlock(struct call_single_data *csd)
 {
 	WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
@@ -140,16 +164,20 @@ static void csd_unlock(struct call_single_data *csd)
  * ->func, ->info, and ->flags set.
  */
 static
-void generic_exec_single(int cpu, struct call_single_data *csd, int wait)
+void generic_exec_single(int cpu, struct call_single_data *csd,
+			 smp_call_func_t func, void *info, int wait)
 {
 	struct call_single_queue *dst = &per_cpu(call_single_queue, cpu);
-	unsigned long flags;
+	unsigned long flags = csd_lock_irqsave(csd);
 	int ipi;
 
-	raw_spin_lock_irqsave(&dst->lock, flags);
+	csd->func = func;
+	csd->info = info;
+
+	raw_spin_lock(&dst->lock);
 	ipi = list_empty(&dst->list);
 	list_add_tail(&csd->list, &dst->list);
-	raw_spin_unlock_irqrestore(&dst->lock, flags);
+	raw_spin_unlock(&dst->lock);
 
 	/*
 	 * The list addition should be visible before sending the IPI
@@ -165,6 +193,8 @@ void generic_exec_single(int cpu, struct call_single_data *csd, int wait)
 	if (ipi)
 		arch_send_call_function_single_ipi(cpu);
 
+	local_irq_restore(flags);
+
 	if (wait)
 		csd_lock_wait(csd);
 }
@@ -245,11 +275,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 		if (!wait)
 			csd = &__get_cpu_var(csd_data);
 
-		csd_lock(csd);
-
-		csd->func = func;
-		csd->info = info;
-		generic_exec_single(cpu, csd, wait);
+		generic_exec_single(cpu, csd, func, info, wait);
 	} else {
 		err = -ENXIO;	/* CPU not online */
 	}
@@ -335,8 +361,7 @@ void __smp_call_function_single(int cpu, struct call_single_data *csd,
 		csd->func(csd->info);
 		local_irq_restore(flags);
 	} else {
-		csd_lock(csd);
-		generic_exec_single(cpu, csd, wait);
+		generic_exec_single(cpu, csd, csd->func, csd->info, wait);
 	}
 	put_cpu();
 }
--
1.8.1.4
--
Thanks.
-- Max