Message-ID: <5224BCF6.2080401@colorfullife.com>
Date: Mon, 02 Sep 2013 18:29:42 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Vineet Gupta <Vineet.Gupta1@...opsys.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Davidlohr Bueso <dave.bueso@...il.com>,
Sedat Dilek <sedat.dilek@...il.com>,
Davidlohr Bueso <davidlohr.bueso@...com>,
linux-next <linux-next@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-mm <linux-mm@...ck.org>, Andi Kleen <andi@...stfloor.org>,
Rik van Riel <riel@...hat.com>,
Jonathan Gonzalez <jgonzalez@...ets.cl>
Subject: Re: ipc-msg broken again on 3.11-rc7?
Hi,
[forgot to cc everyone, thus I'll summarize some mails...]
On 09/02/2013 06:58 AM, Vineet Gupta wrote:
> On 08/31/2013 11:20 PM, Linus Torvalds wrote:
>> Vineet, actual patch for what Davidlohr suggests attached. Can you try it?
>>
>> Linus
> Apologies for late in getting back to this - I was away from my computer for a bit.
>
> Unfortunately, with a quick test, this patch doesn't help.
> FWIW, this is latest mainline (.config attached).
>
> Let me know what diagnostics I can add to help with this.
msgctl08 is a bulk message send/receive test. I had to look at it once
before; back then the cause turned out to be broken hardware:
https://lkml.org/lkml/2008/6/12/365
Broken hardware can be ruled out this time, because the test works with 3.10.
msgctl08 uses pairs of threads: one thread does msgsnd(), the other one
msgrcv().
There is no synchronization, i.e. msgsnd() can race ahead until the
kernel buffer is full, followed by a block of msgrcv() calls, or the
two threads could alternate msgsnd()/msgrcv() operations.
No special features are used: each pair of threads has its own message
queue, and all messages have type=1.
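The access pattern described above can be sketched roughly as follows. This is not the msgctl08 source: run_pair(), struct pair_msg and the 64-byte payload are hypothetical names and values chosen for illustration, and the sketch uses fork() rather than threads to stay free of extra link flags. It creates one private queue, lets a writer call msgsnd() and a reader call msgrcv() with type=1, and uses no synchronization beyond the blocking behaviour of the queue itself.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/wait.h>

#define PAYLOAD 64  /* arbitrary payload size, assumption for this sketch */

struct pair_msg {
    long mtype;           /* always 1, matching the description above */
    char mtext[PAYLOAD];
};

/* Hypothetical helper: returns 0 if the reader got all nmsgs messages. */
int run_pair(int nmsgs)
{
    /* each pair gets its own private queue */
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        msgctl(qid, IPC_RMID, NULL);
        return -1;
    }

    if (pid == 0) {
        /* reader: blocks in msgrcv() whenever the queue is empty */
        struct pair_msg m;
        for (int i = 0; i < nmsgs; i++) {
            if (msgrcv(qid, &m, sizeof(m.mtext), 1, 0) !=
                (ssize_t)sizeof(m.mtext))
                _exit(1);
        }
        _exit(0);
    }

    /* writer: blocks in msgsnd() whenever the queue is full */
    struct pair_msg m;
    m.mtype = 1;
    memset(m.mtext, 'x', sizeof(m.mtext));

    int sent = 0;
    while (sent < nmsgs && msgsnd(qid, &m, sizeof(m.mtext), 0) == 0)
        sent++;

    /* on a send failure, remove the queue first so the reader wakes up */
    if (sent != nmsgs)
        msgctl(qid, IPC_RMID, NULL);

    int status = 0;
    int waited = waitpid(pid, &status, 0);

    if (sent == nmsgs)
        msgctl(qid, IPC_RMID, NULL);

    if (sent != nmsgs || waited < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling run_pair(500) exercises a single writer/reader pair; depending on scheduling, the writer either fills the queue before the reader drains it or the two sides alternate.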
Vineet ran strace: right up to the signal that kills msgctl08, the trace
shows only msgsnd()/msgrcv() calls.
Vineet:
a) could you run strace tomorrow again, with '-ttt' as an additional
option? I don't see where exactly it hangs.
b) Could you check that it is not just a performance regression?
Does ./msgctl08 1000 16 hang, too?
In ipc/msg.c, I haven't seen any obvious reason why it should hang.
The only race I spotted so far is this one:
> 	for (;;) {
> 		struct msg_sender s;
>
> 		err = -EACCES;
> 		if (ipcperms(ns, &msq->q_perm, S_IWUGO))
> 			goto out_unlock1;
>
> 		err = security_msg_queue_msgsnd(msq, msg, msgflg);
> 		if (err)
> 			goto out_unlock1;
>
> 		if (msgsz + msq->q_cbytes <= msq->q_qbytes &&
> 				1 + msq->q_qnum <= msq->q_qbytes) {
> 			break;
> 		}
>
[snip]
> 	if (!pipelined_send(msq, msg)) {
> 		/* no one is waiting for this message, enqueue it */
> 		list_add_tail(&msg->m_list, &msq->q_messages);
> 		msq->q_cbytes += msgsz;
> 		msq->q_qnum++;
> 		atomic_add(msgsz, &ns->msg_bytes);
The access to msq->q_cbytes is not protected. Thus two parallel msgsnd()
calls could both succeed, even if together they bring the queue length
above the limit.
But it can't explain why 3.11-rc7 hangs: As explained above, msgctl08
uses one queue for each thread pair.
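To make the check-then-update race on q_cbytes concrete, here is a small userspace model. It is only a sketch: struct toy_queue and the toy_* functions are hypothetical stand-ins, not kernel code, and the interleaving is replayed deterministically in one thread instead of being provoked with real concurrency. Both senders run the room check against the same q_cbytes value before either of them updates it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two fields used by the room check; NOT the kernel struct. */
struct toy_queue {
    size_t q_cbytes;  /* bytes currently queued */
    size_t q_qbytes;  /* byte limit of the queue */
};

/* Same shape as the byte-count half of the check quoted above. */
static bool toy_has_room(const struct toy_queue *q, size_t msgsz)
{
    return msgsz + q->q_cbytes <= q->q_qbytes;
}

static void toy_enqueue(struct toy_queue *q, size_t msgsz)
{
    q->q_cbytes += msgsz;
}

/*
 * Replays one bad interleaving of two senders A and B:
 * both checks happen before either update.
 * Returns the resulting q_cbytes.
 */
size_t toy_interleaved_send(struct toy_queue *q, size_t msgsz_a, size_t msgsz_b)
{
    bool a_ok = toy_has_room(q, msgsz_a);  /* A sees the old q_cbytes */
    bool b_ok = toy_has_room(q, msgsz_b);  /* B sees the same old value */

    if (a_ok)
        toy_enqueue(q, msgsz_a);
    if (b_ok)
        toy_enqueue(q, msgsz_b);

    return q->q_cbytes;
}
```

With q_cbytes = 16000, q_qbytes = 16384 and two 300-byte messages, each check passes on its own (16300 <= 16384), yet the queue ends up at 16600 bytes, above the limit. The model needs two senders on the same queue, which is the situation msgctl08 does not create.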
--
Manfred