[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4AAEF5DC.4070308@cn.fujitsu.com>
Date: Tue, 15 Sep 2009 10:03:08 +0800
From: Xiao Guangrong <xiaoguangrong@...fujitsu.com>
To: Suresh Siddha <suresh.b.siddha@...el.com>
CC: Peter Zijlstra <peterz@...radead.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"mm-commits@...r.kernel.org" <mm-commits@...r.kernel.org>,
"jens.axboe@...cle.com" <jens.axboe@...cle.com>,
"mingo@...e.hu" <mingo@...e.hu>,
"nickpiggin@...oo.com.au" <nickpiggin@...oo.com.au>,
"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch
added to -mm tree
Suresh Siddha wrote:
> On Mon, 2009-09-14 at 00:22 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>
>>>> CPU A call smp_call_function_many(wait=0) that want CPU B to call
>>>> a specific function, after smp_call_function_many() return, we let
>>>> CPU A offline immediately. Unfortunately, if CPU B receives this
>>>> IPI interrupt after CPU A down, it will crash like above description.
>>> How can cpu B receive the IPI interrupt after cpu A is down?
>>>
>>> As part of the cpu A going down, we first do the stop machine. i.e.,
>>> schedule the stop machine worker threads on each cpu. So, by the time
>>> all the worker threads on all the cpu's get scheduled and synchronized,
>>> ipi on B should get delivered.
>>>
>> Actually, those two examples have the same reason, that is how long
>> the destination CPU will receive the IPI interruption?
>>
>> If the stop machine threads can schedule in CPU B during the IPI
>> interruption delivering, It will occur those issue.
>>
>> I understand what you say but let me confuse is how we ensure it? The IPI
>> interruption is delivered over the APIC bus, It need several CPU instruction
>> cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
>> the answer, could you point out for me?
>
> Xiao, There is quite a bit of time between the time a particular cpu
> sends a smp call function IPI (with wait == 0) and the time that cpu
> starts running the stop machine thread and the whole system proceeds
> with stop machine. With in this time, typically the smp call function
> destination will receive the ipi interrupt. But theoretically the
> problem you explain might happen.
>
Yeah, though this case is very infrequent, but we can't avoid it.
>>>From P4 onwards, interrupts are delivered over system bus and with NHM
> it is QPI. Also, the mechanism of scheduling the stop machine thread on
> a particular cpu involves resched IPI etc.
>
> Nevertheless, Have you seen a real hang or system crash due to this? If
> so, on what platform?
>
> Ideally, for xapic based platform, clear status of sender APIC ICR's
> delivery status indicates that the interrupt is registered at the
> receiver. for x2apic based platform, sending another interrupt will
> ensure that the previous interrupt was delivered.
>
> If you have indeed seen a crash related to this, can you review and give
> the appended patch a try and see if it fixes the issue? If you agree
> with the fix, then I will send the patch with a detailed change log etc.
>
I not seen this crash, just afraid it's unsafe while I review the code,
I try to generate this crash, but as you know, the race point is hard
to control.
> Your current fix is not clean and not complete in my opinion (as calling
> interrupt handlers manually and not doing the callbacks etc might cause
> other side affects). Thanks.
It is not the last version and doing the callbacks in another patch,
see below URL please:
http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2
I think we do better handle this in CPU down path in kernel/smp.c, It's very
safe and let people easy to understand.
> ---
>
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 9e3d8af..69ec2a9 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -93,8 +93,9 @@ void generic_smp_call_function_single_interrupt(void);
> void generic_smp_call_function_interrupt(void);
> void ipi_call_lock(void);
> void ipi_call_unlock(void);
> void ipi_call_lock_irq(void);
> void ipi_call_unlock_irq(void);
> +void quiesce_smp_call_functions(void);
> #endif
>
> /*
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 8e21850..d13a888 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -479,6 +479,27 @@ int smp_call_function(void (*func)(void *), void *info, int wait)
> }
> EXPORT_SYMBOL(smp_call_function);
>
> +void quiesce_smp_call_functions(void)
> +{
> + struct call_single_queue *q = &__get_cpu_var(call_single_queue);
> + bool empty;
> + unsigned long flags;
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&q->lock, flags);
> + empty = list_empty(&q->list);
> + spin_unlock_irqrestore(&q->lock, flags);
> + } while (!empty);
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&call_function.lock, flags);
> + empty = list_empty(&call_function.queue);
> + spin_unlock_irqrestore(&call_function.lock, flags);
> + } while (!empty);
> +}
> +
Why we need waiting CPU to handle this? It make no sense because the CPU is dying,
we can simple ignore the IPI request.
> void ipi_call_lock(void)
> {
> spin_lock(&call_function.lock);
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 912823e..dd2d90f 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -86,6 +86,9 @@ static void stop_cpu(struct work_struct *unused)
> curstate = state;
> switch (curstate) {
> case STOPMACHINE_DISABLE_IRQ:
> +#ifdef CONFIG_USE_GENERIC_SMP_HELPERS
> + quiesce_smp_call_functions();
> +#endif
It seems ugly, we can define a noop function if CONFIG_USE_GENERIC_SMP_HELPERS is not
defined.
Another problem is that all CPU must call quiesce_smp_call_functions() here, but only
dying CPU need do it.
> local_irq_disable();
> hard_irq_disable();
It will cause another race, if CPU A send a IPI interruption after CPU B call
quiesce_smp_call_functions() and disable IRQ, it will case the same problem.
(in this time, CPU B is enter stop machine, but CPU A is not)
Thanks,
Xiao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists