Message-ID: <C2D7FE5348E1B147BCA15975FBA2307514168F@IN01WEMBXA.internal.synopsys.com>
Date: Tue, 3 Sep 2013 09:51:58 +0000
From: Vineet Gupta <Vineet.Gupta1@...opsys.com>
To: Manfred Spraul <manfred@...orfullife.com>
CC: Vineet Gupta <Vineet.Gupta1@...opsys.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Davidlohr Bueso <dave.bueso@...il.com>,
Sedat Dilek <sedat.dilek@...il.com>,
Davidlohr Bueso <davidlohr.bueso@...com>,
linux-next <linux-next@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-mm <linux-mm@...ck.org>, Andi Kleen <andi@...stfloor.org>,
Rik van Riel <riel@...hat.com>,
"Jonathan Gonzalez" <jgonzalez@...ets.cl>
Subject: Re: ipc-msg broken again on 3.11-rc7?
On 09/03/2013 02:53 PM, Manfred Spraul wrote:
> On 09/03/2013 11:16 AM, Vineet Gupta wrote:
>> On 09/03/2013 02:27 PM, Manfred Spraul wrote:
>>> On 09/03/2013 10:44 AM, Vineet Gupta wrote:
>>>>> b) Could you check that it is not just a performance regression?
>>>>> Does ./msgctl08 1000 16 hang, too?
>>>> Nope, that doesn't hang. The minimal configuration that hangs reliably is
>>>> ./msgctl08 50000 2
>>>>
>>>> With this config there are 3 processes.
>>>> ...
>>>> 555 554 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>>> 554 551 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>>> 551 496 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>>> ...
>>>>
>>>> [ARCLinux]$ cat /proc/551/stack
>>>> [<80aec3c6>] do_wait+0xa02/0xc94
>>>> [<80aecad2>] SyS_wait4+0x52/0xa4
>>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>>
>>>> [ARCLinux]$ cat /proc/555/stack
>>>> [<80c2950e>] SyS_msgrcv+0x252/0x420
>>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>>
>>>> [ARCLinux]$ cat /proc/554/stack
>>>> [<80c28c82>] do_msgsnd+0x116/0x35c
>>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>>
>>>> Is this a case of a lost wakeup or some such? I'm running with some more
>>>> diagnostics and will report soon ...
>>> What is the output of ipcs -q? Is the queue full or empty when it hangs?
>>> I.e. do we forget to wake up a receiver or forget to wake up a sender?
>> / # ipcs -q
>>
>> ------ Message Queues --------
>> key msqid owner perms used-bytes messages
>> 0x72d83160 163841 root 600 0 0
>>
>>
> Ok, a sender is sleeping - even though there are no messages in the queue.
> Perhaps it is the race that I mentioned in a previous mail:
>> for (;;) {
>>         struct msg_sender s;
>>
>>         err = -EACCES;
>>         if (ipcperms(ns, &msq->q_perm, S_IWUGO))
>>                 goto out_unlock1;
>>
>>         err = security_msg_queue_msgsnd(msq, msg, msgflg);
>>         if (err)
>>                 goto out_unlock1;
>>
>>         if (msgsz + msq->q_cbytes <= msq->q_qbytes &&
>>                         1 + msq->q_qnum <= msq->q_qbytes) {
>>                 break;
>>         }
>>
> [snip]
>> if (!pipelined_send(msq, msg)) {
>>         /* no one is waiting for this message, enqueue it */
>>         list_add_tail(&msg->m_list, &msq->q_messages);
>>         msq->q_cbytes += msgsz;
>>         msq->q_qnum++;
>>         atomic_add(msgsz, &ns->msg_bytes);
> The access to msq->q_cbytes is not protected.
>
> Vineet, could you try to move the test for free space after ipc_lock?
> I.e. the lock must not get dropped between testing for free space and
> enqueueing the messages.
Hmm, the code movement is not trivial. I broke even the simplest of cases (patch
attached). This includes the additional change which Linus/Davidlohr had asked for.
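
For reference, the invariant Manfred describes (do not drop the lock between
the free-space test and the enqueue) can be sketched in userspace. This is
only an illustration and not the attached patch: struct toy_queue, toy_send()
and toy_recv() are hypothetical names, a pthread mutex stands in for the ipc
lock, and a full queue returns -EAGAIN where the kernel would put the sender
to sleep.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct msg_queue: only the accounting
 * fields that appear in the quoted snippet are modelled. */
struct toy_queue {
	pthread_mutex_t lock;	/* plays the role of the ipc lock */
	size_t q_cbytes;	/* bytes currently queued */
	size_t q_qnum;		/* messages currently queued */
	size_t q_qbytes;	/* limit on queued bytes */
};

static void toy_queue_init(struct toy_queue *q, size_t limit)
{
	pthread_mutex_init(&q->lock, NULL);
	q->q_cbytes = 0;
	q->q_qnum = 0;
	q->q_qbytes = limit;
}

/* Returns 0 if the message was accounted for, -EAGAIN if there is no
 * room. The free-space test and the counter updates happen under a
 * single lock hold, so no other sender or receiver can change
 * q_cbytes or q_qnum in between. */
static int toy_send(struct toy_queue *q, size_t msgsz)
{
	int err = -EAGAIN;

	pthread_mutex_lock(&q->lock);
	if (msgsz + q->q_cbytes <= q->q_qbytes &&
	    1 + q->q_qnum <= q->q_qbytes) {
		q->q_cbytes += msgsz;
		q->q_qnum++;
		err = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return err;
}

/* Removes the accounting for one message of msgsz bytes. Returns
 * -ENOMSG if the queue holds no message or fewer bytes than msgsz. */
static int toy_recv(struct toy_queue *q, size_t msgsz)
{
	int err = -ENOMSG;

	pthread_mutex_lock(&q->lock);
	if (q->q_qnum > 0 && q->q_cbytes >= msgsz) {
		q->q_cbytes -= msgsz;
		q->q_qnum--;
		err = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return err;
}
```

The real do_msgsnd() also has to handle the sleeping path and
pipelined_send(), which is where the code movement gets non-trivial; the
sketch leaves both out.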
-Vineet
View attachment "patch-ipc-2" of type "text/plain" (1075 bytes)