linux-kernel - Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line


Message-ID: <CA+55aFxPgGZ2k81Leo-xVVBPThmh29GoCXeq3-iK88KhFNuPdQ@mail.gmail.com>
Date:	Fri, 1 Mar 2013 10:52:12 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Davidlohr Bueso <davidlohr.bueso@...com>
Cc:	Rik van Riel <riel@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	"Vinod, Chegu" <chegu_vinod@...com>,
	"Low, Jason" <jason.low2@...com>,
	linux-tip-commits@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>, aquini@...hat.com,
	Michel Lespinasse <walken@...gle.com>,
	Ingo Molnar <mingo@...nel.org>,
	Larry Woodman <lwoodman@...hat.com>
Subject: Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock
 out of line

On Fri, Mar 1, 2013 at 10:18 AM, Davidlohr Bueso <davidlohr.bueso@...com> wrote:
> On Fri, 2013-03-01 at 01:42 -0500, Rik van Riel wrote:
>>
>> Checking try_atomic_semop and do_smart_update, it looks like neither
>> is using atomic operations. That part of the semaphore code would
>> still benefit from spinlocks.
>
> Agreed.

Yup. As mentioned, I hadn't even looked at that part of the code, but
yes, it definitely wants the spinlock.

> How about splitting ipc_lock()/ipc_lock_control() in two calls: one to
> obtain the ipc object (rcu_read_lock + idr_find), which can be called
> when performing the permissions and security checks, and another to
> obtain the ipcp->lock [q_]spinlock when necessary.

That sounds like absolutely the right thing to do. And we can leave
the old helper functions that do both of them around, and only use the
split case for just a few places.
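
A minimal userspace sketch of the split discussed above, not kernel code. Every name here (`ipc_obtain_object_sketch`, `ipc_lock_object_sketch`, `struct ipc_obj`, the `*_stub` helpers, and the error values) is hypothetical and only models the idea: the lookup runs under the RCU read lock, the permission check runs without the spinlock, and the old combined helper stays available.

```c
/* Userspace model only: counters stand in for kernel locking primitives. */
#include <assert.h>
#include <stddef.h>

struct ipc_obj {        /* hypothetical stand-in for the ipc object */
    int id;
    int locked;         /* models ipcp->lock being held */
    int perm_ok;        /* models the permission/security check result */
};

static int rcu_depth;   /* models RCU read-side nesting depth */
static struct ipc_obj table[2] = { { 0, 0, 1 }, { 1, 0, 0 } };

void rcu_read_lock_stub(void)   { rcu_depth++; }
void rcu_read_unlock_stub(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Step 1: lookup only. The caller must already hold the RCU read lock. */
struct ipc_obj *ipc_obtain_object_sketch(int id)
{
    assert(rcu_depth > 0);
    if (id < 0 || id >= 2)
        return NULL;
    return &table[id];
}

/* Step 2: take the per-object lock only when it is really needed. */
void ipc_lock_object_sketch(struct ipc_obj *p)
{
    assert(!p->locked);
    p->locked = 1;
}

void ipc_unlock_object_sketch(struct ipc_obj *p)
{
    assert(p->locked);
    p->locked = 0;
}

/* Old-style combined helper, kept for callers that do not need the split. */
struct ipc_obj *ipc_lock_sketch(int id)
{
    struct ipc_obj *p;

    rcu_read_lock_stub();
    p = ipc_obtain_object_sketch(id);
    if (!p) {
        rcu_read_unlock_stub();
        return NULL;
    }
    ipc_lock_object_sketch(p);
    return p;
}

/* A caller using the split: the checks run before the lock is taken. */
int do_op_sketch(int id)
{
    struct ipc_obj *p;
    int err = 0;

    rcu_read_lock_stub();
    p = ipc_obtain_object_sketch(id);
    if (!p) {
        err = -1;               /* placeholder for an -EINVAL style error */
        goto out;
    }
    if (!p->perm_ok) {
        err = -2;               /* placeholder for -EACCES; lock never taken */
        goto out;
    }

    ipc_lock_object_sketch(p);
    /* ... modify the object under the lock ... */
    ipc_unlock_object_sketch(p);
out:
    rcu_read_unlock_stub();
    return err;
}
```

In this model a failed permission check never touches the per-object lock, which is the part of the proposal expected to reduce contention.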

And if we make the RCU read-lock be explicit too, we could get rid of
some of the ugliness. Right now we have semtimedop() do things like a
conditional "find_alloc_undo()", which will get the RCU lock. It would
be much nicer if we just cleaned up the interfaces a tiny bit, said
that the *caller* has to get the RCU lock, and just do this
unconditionally before calling any of it. Because right now the RCU
details are quite messy, and we have code like

                if (un)
                        rcu_read_unlock();
                error = PTR_ERR(sma);
                goto out_free;

etc, when it would actually be much simpler to just do the RCU read
locking unconditionally (we'll need it for the semaphore lookup
anyway) and just have the exit paths unlock unconditionally like we
usually do (ie a few different exit goto's that just nest the
unlocking properly).
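
The unconditional pattern described above might look roughly like the following userspace sketch. It is not the actual semtimedop() code: `semtimedop_sketch`, the `*_stub` lock helpers, the two structs, and the error values are all assumptions made for illustration. The point is only that the RCU read lock is taken once up front and the exit labels unwind in reverse order of acquisition.

```c
/* Userspace model only: counters stand in for kernel locking primitives. */
#include <assert.h>
#include <stddef.h>

static int rcu_nesting;     /* models RCU read-side nesting depth */
static int sem_locked;      /* models the semaphore array spinlock */

void rcu_lock_stub(void)   { rcu_nesting++; }
void rcu_unlock_stub(void) { assert(rcu_nesting > 0); rcu_nesting--; }
void sem_lock_stub(void)   { assert(!sem_locked); sem_locked = 1; }
void sem_unlock_stub(void) { assert(sem_locked); sem_locked = 0; }

struct undo_sketch { int unused; };         /* hypothetical undo record */
struct sem_array_sketch { int nsems; };     /* hypothetical semaphore set */

static struct undo_sketch the_undo;
static struct sem_array_sketch the_sma = { 4 };

/* Both lookups assume the caller already holds the RCU read lock. */
struct undo_sketch *find_undo_sketch(int want_undo)
{
    assert(rcu_nesting > 0);
    return want_undo ? &the_undo : NULL;
}

struct sem_array_sketch *sem_lookup_sketch(int semid)
{
    assert(rcu_nesting > 0);
    return semid == 0 ? &the_sma : NULL;
}

int semtimedop_sketch(int semid, int semnum, int want_undo)
{
    struct sem_array_sketch *sma;
    struct undo_sketch *un;
    int error;

    rcu_lock_stub();            /* always taken, not only when undo is used */
    un = find_undo_sketch(want_undo);
    (void)un;                   /* the undo record is not used further here */

    sma = sem_lookup_sketch(semid);
    if (!sma) {
        error = -1;             /* placeholder for PTR_ERR(sma) */
        goto out_rcu;           /* no "if (un)" test needed on this path */
    }

    sem_lock_stub();
    if (semnum < 0 || semnum >= sma->nsems) {
        error = -2;             /* placeholder for a range error */
        goto out_unlock;
    }
    error = 0;                  /* ... the real operation would go here ... */

out_unlock:
    sem_unlock_stub();
out_rcu:
    rcu_unlock_stub();
    return error;
}
```

Because every path leaves through the same nested labels, the lock state on return does not depend on whether an undo structure was requested.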

It would simplify the odd locking both for humans and for code
generation. Right now we actually nest those RCU read locks two deep,
as far as I can see.

And it looks like this could be done in fairly small independent steps
("add explicit RCU locking", "split up helper functions", "start using
the simplified helper functions in selected paths that care").

It won't *solve* the locking issues, and I'm sure we'll still see
contention, but it should clean the code up and probably help the
contention numbers visibly. Even if it's only 10% (which, judging by
my profiles, would be a safe lower bound from just moving the security
callback out of the spinlock - and it could easily be 20% or more
because contention often begets more contention) that would be in the
same kind of ballpark as the spinlock improvements. And using Michel's
work would be a largely independent scalability improvement on top.

Anybody willing to give it a try? I suspect Rik's benchmark, some
"perf record", and concentrating on just semtimedop() to begin with
would be a fairly good starting point.

                 Linus