linux-kernel - Re: ipc,sem: sysv semaphore scalability

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1364585774.31320.9.camel@thor.lan>
Date:	Fri, 29 Mar 2013 15:36:14 -0400
From:	Peter Hurley <peter@...leysoftware.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Dave Jones <davej@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...riel.com>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	hhuang@...hat.com, "Low, Jason" <jason.low2@...com>,
	Michel Lespinasse <walken@...gle.com>,
	Larry Woodman <lwoodman@...hat.com>,
	"Vinod, Chegu" <chegu_vinod@...com>,
	Stanislav Kinsbursky <skinsbursky@...allels.com>
Subject: Re: ipc,sem: sysv semaphore scalability

On Fri, 2013-03-29 at 12:26 -0700, Linus Torvalds wrote:
> On Fri, Mar 29, 2013 at 12:06 PM, Dave Jones <davej@...hat.com> wrote:
> >
> > Here's an oops I just hit..
> >
> > BUG: unable to handle kernel NULL pointer dereference at 000000000000000f
> > IP: [<ffffffff812c24ca>] testmsg.isra.5+0x1a/0x60
> 
> Btw, looking at the code leading up to this, what the f*ck is wrong
> with the IPC stuff?
> 
> It's using the generic list stuff for most of the lists, but then it
> open-codes the accesses.
> 
> So instead of using
> 
>    for_each_entry(walk_msg, &msq->q_messages, m_list) {
>       ..
>    }
> 
> the ipc/msg.c code does all that by hand, with
> 
>    tmp = msq->q_messages.next;
>    while (tmp != &msq->q_messages) {
>       struct msg_msg *walk_msg;
> 
>       walk_msg = list_entry(tmp, struct msg_msg, m_list);
>       ...
>       tmp = tmp->next;
>    }
> 
> Ugh. The code is near unreadable. And then it has magic memory
> barriers etc, implying that it doesn't lock the data structures, but
> no comments about them. See expunge_all() and pipelined_send().
> 
> The code seems entirely random, and it's badly set up (annoyance of
> the day: crazy helper functions in ipc/msgutil.c to make sure that (a)
> you have to spend more effort looking for them, and (b) they won't get
> inlined).
> 
> Clearly nobody has cared for the crazy IPC message code in a long time.

Exactly that's what my patch series does; clean this mess up.

This is what I wrote to Andrew a couple of days ago.

On Tue, 2013-03-26 at 22:33 -0400, Peter Hurley wrote:
I just figured out how the queue is being corrupted and why my series
> fixes it.
> 
> 
> With MSG_COPY set, the queue scan can exit with the local variable
'msg'
> pointing to a real msg if the msg_counter never reaches the
copy_number.
> 
> The remaining execution looks like this:
> 
> 	if (!IS_ERR(msg)) {
> 		....
> 		if (msgflg & MSG_COPY)
> 			goto out_unlock;
> 		....
> 
> out_unlock:
> 			msg_unlock(msq);
> 			break;
> 		}
> 	}
> 	if (IS_ERR(msg))
> 		....
> 
> 	bufsz = msg_handler();
> 	free_msg(msg);			<<---- msg never unlinked
> 
> 
> Since the msg should not have been found (because it failed the match
> criteria), the if (!IS_ERR(msg)) clause should never have executed.
> 
> That's why my refactor fixes resolve this; because msg is not
> inadvertently treated as a found msg.
> 
> But let's be honest; the real bug here is the poor structure of this
> function that even made this possible. The deepest nesting executes a
> goto to a label in the middle of an if clause. Yuck! No wonder this
> thing's fragile.
> 
> So my recommendation still stands. The series that fixes this has been
> getting tested in linux-next for a month. Fixing this some other way
is
> just asking for more trouble.
> 
> But why not just revert MSG_COPY altogether for 3.9?

Regards,
Peter Hurley

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/