Message-ID: <5225AA8D.6080403@colorfullife.com>
Date: Tue, 03 Sep 2013 11:23:25 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Vineet Gupta <Vineet.Gupta1@...opsys.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Davidlohr Bueso <dave.bueso@...il.com>,
Sedat Dilek <sedat.dilek@...il.com>,
Davidlohr Bueso <davidlohr.bueso@...com>,
linux-next <linux-next@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-mm <linux-mm@...ck.org>, Andi Kleen <andi@...stfloor.org>,
Rik van Riel <riel@...hat.com>,
Jonathan Gonzalez <jgonzalez@...ets.cl>
Subject: Re: ipc-msg broken again on 3.11-rc7?
On 09/03/2013 11:16 AM, Vineet Gupta wrote:
> On 09/03/2013 02:27 PM, Manfred Spraul wrote:
>> On 09/03/2013 10:44 AM, Vineet Gupta wrote:
>>>> b) Could you check that it is not just a performance regression?
>>>> Does ./msgctl08 1000 16 hang, too?
>>> Nope that doesn't hang. The minimal configuration that hangs reliably is msgctl
>>> 50000 2
>>>
>>> With this config there are 3 processes.
>>> ...
>>> 555 554 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>> 554 551 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>> 551 496 root S 1208 0.4 0 0.0 ./msgctl08 50000 2
>>> ...
>>>
>>> [ARCLinux]$ cat /proc/551/stack
>>> [<80aec3c6>] do_wait+0xa02/0xc94
>>> [<80aecad2>] SyS_wait4+0x52/0xa4
>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>
>>> [ARCLinux]$ cat /proc/555/stack
>>> [<80c2950e>] SyS_msgrcv+0x252/0x420
>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>
>>> [ARCLinux]$ cat /proc/554/stack
>>> [<80c28c82>] do_msgsnd+0x116/0x35c
>>> [<80ae24fc>] ret_from_system_call+0x0/0x4
>>>
>>> Is this a case of a lost wakeup or some such? I'm running with some more
>>> diagnostics and will report soon ...
>> What is the output of ipcs -q? Is the queue full or empty when it hangs?
>> I.e. do we forget to wake up a receiver or forget to wake up a sender?
> / # ipcs -q
>
> ------ Message Queues --------
> key        msqid      owner      perms      used-bytes   messages
> 0x72d83160 163841     root       600        0            0
>
>
Ok, a sender is sleeping - even though there are no messages in the queue.
Perhaps it is the race that I mentioned in a previous mail:
> 	for (;;) {
> 		struct msg_sender s;
>
> 		err = -EACCES;
> 		if (ipcperms(ns, &msq->q_perm, S_IWUGO))
> 			goto out_unlock1;
>
> 		err = security_msg_queue_msgsnd(msq, msg, msgflg);
> 		if (err)
> 			goto out_unlock1;
>
> 		if (msgsz + msq->q_cbytes <= msq->q_qbytes &&
> 				1 + msq->q_qnum <= msq->q_qbytes) {
> 			break;
> 		}
>
[snip]
> 	if (!pipelined_send(msq, msg)) {
> 		/* no one is waiting for this message, enqueue it */
> 		list_add_tail(&msg->m_list, &msq->q_messages);
> 		msq->q_cbytes += msgsz;
> 		msq->q_qnum++;
> 		atomic_add(msgsz, &ns->msg_bytes);
The access to msq->q_cbytes in the free-space test is not protected by
the lock.
Vineet, could you try to move the test for free space after ipc_lock?
I.e. the lock must not get dropped between testing for free space and
enqueueing the message.
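
A minimal userspace sketch of that rule, not kernel code: the struct and
all demo_* names are hypothetical stand-ins, and a pthread mutex takes the
place of the ipc lock. It only illustrates the general check-then-act
problem (a free-space test that goes stale once the lock is dropped); it
is not a confirmed analysis of the hang reported above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue accounting fields. */
struct demo_queue {
	pthread_mutex_t lock;
	size_t cbytes;	/* bytes currently queued (cf. q_cbytes) */
	size_t qbytes;	/* byte limit of the queue (cf. q_qbytes) */
	size_t qnum;	/* messages currently queued (cf. q_qnum) */
};

void demo_init(struct demo_queue *q, size_t limit)
{
	pthread_mutex_init(&q->lock, NULL);
	q->cbytes = 0;
	q->qbytes = limit;
	q->qnum = 0;
}

/* Racy pattern, step 1: test for free space, then drop the lock. */
bool demo_has_space(struct demo_queue *q, size_t msgsz)
{
	bool ok;

	pthread_mutex_lock(&q->lock);
	ok = msgsz + q->cbytes <= q->qbytes;
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* Racy pattern, step 2: enqueue, trusting the earlier (maybe stale) test. */
void demo_enqueue_unchecked(struct demo_queue *q, size_t msgsz)
{
	pthread_mutex_lock(&q->lock);
	q->cbytes += msgsz;
	q->qnum++;
	pthread_mutex_unlock(&q->lock);
}

/* Suggested pattern: test and enqueue under one uninterrupted lock hold. */
bool demo_send(struct demo_queue *q, size_t msgsz)
{
	bool ok;

	pthread_mutex_lock(&q->lock);
	ok = msgsz + q->cbytes <= q->qbytes;
	if (ok) {
		q->cbytes += msgsz;
		q->qnum++;
		/* The limit cannot be exceeded on this path. */
		assert(q->cbytes <= q->qbytes);
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}
```

If two senders each call demo_has_space(q, 60) on a queue limited to 100
bytes before either calls demo_enqueue_unchecked(q, 60), both tests pass
and the queue ends up holding 120 bytes. With demo_send(q, 60) the second
call returns false, because the test and the update see the same state.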
--
Manfred