[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46FCEE9A.5050806@bull.net>
Date:	Fri, 28 Sep 2007 14:07:54 +0200
From:	Laurent Vivier <Laurent.Vivier@...l.net>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Fengguang Wu <wfg@...l.ustc.edu.cn>, Andi Kleen <ak@...e.de>,
	linux-kernel@...r.kernel.org
Subject: Re: WARNING: at arch/x86_64/kernel/smp.c:397 smp_call_function_mask()
Andrew Morton wrote:
> On Fri, 28 Sep 2007 11:18:45 +0200 Laurent Vivier <Laurent.Vivier@...l.net> wrote:
> 
>> Andrew Morton wrote:
>>> On Fri, 28 Sep 2007 10:52:08 +0200 Laurent Vivier <Laurent.Vivier@...l.net> wrote:
>>>> Andi, is this correct ?
>>>> Andrew, should I send a patch implementing this change ?
>>> umm, I think all the smp_call_function fucntions are deadlocky if called
>>> with local interrupts disabled, regardless of whether the calling CPU is in
>>> the mask.
>>>
>>> If CPU A is sending a cross-cpu call to CPU B and CPU B is sending a
>>> cross-cpu call to CPU A, and they both have local interrupts disabled...
>> OK, so there are two errors:
>>
>> 1- one I introduce myself (without any help from anyone) where
>> smp_call_function() calls all online CPUs instead of calling all CPUs except itself.
> 
> I'd be pretty surprised if one was able to write a bug like that.  You mean
> the CPU sends an IPI to itself and then loops around until it has serviced
> that IPI?  And this works?  Wow.
In fact, it works because __smp_call_function_mask() makes :
 cpus_and(mask, mask, allbutself);
So it removes current cpu from the mask. BTW, I don't have to modify
smp_call_function(): it is correct as it is written (except from a semantic
point of view... do we care about semantic ?).
> And on_each_cpu() can call the handler function twice?
> 
> hm
> 
>> 2- one in global_flush_tlb() which calls smp_call_function() with irqs disabled.
> 
> That would be a big bug, and surely we would already have picked it up.
> 
> <looks>
> 
> argh, mainline's x86_64 smp_call_function() doesn't do the check.  We've
> had *heaps* of bugs in i386 where people were running smp_call_foo() under
> local_irq_disable().  I wonder how many there are in x86_64?
> 
>> I think I should at least correct #1 ?
> 
> I think we should correct all bugs ;)
> 
We don't live in a perfect world... ;-)
Moreover, I'm not able to reproduce #2 ...
Fengguang could you send me your .config and your qemu command line parameters ?
Laurent
PS: if semantic is important, you can apply attached patch...
-- 
------------- Laurent.Vivier@...l.net  --------------
          "Software is hard" - Donald Knuth
Download attachment "0001-smp_call_function-call-function-on-all-CPUs.patch" of type "application/mbox" (814 bytes)
Download attachment "signature.asc" of type "application/pgp-signature" (190 bytes)
Powered by blists - more mailing lists
 
