linux-kernel - Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+55aFx6N6SaRN+w=CTXJ=ZSsOE6KeS2R9sT_uhAx9h2j+hjKg@mail.gmail.com>
Date:	Wed, 13 Feb 2013 08:20:42 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Rik van Riel <riel@...hat.com>, rostedt@...dmiss.org,
	aquini@...hat.com, Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Michel Lespinasse <walken@...gle.com>
Cc:	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock
 out of line

On Wed, Feb 13, 2013 at 4:06 AM, tip-bot for Rik van Riel
<riel@...hat.com> wrote:
>
> x86/smp: Move waiting on contended ticket lock out of line
>
> Moving the wait loop for congested loops to its own function
> allows us to add things to that wait loop, without growing the
> size of the kernel text appreciably.

Did anybody actually look at the code generation of this?

Adding an external function call is *horrible*, and you might almost
as well just uninline the spinlock entirely if you do this. It means
that all the small callers now have their registers trashed, whether
the unlikely function call is taken or not, and now leaf functions
aren't leaves any more.

Guys, the biggest cost of a function call is not the "call"
instruction, it's all the crap around it - trashed registers causing
us to have to have a stack frame in the caller when none was required
otherwise, fixed register allocation by the arguments/return value etc
etc. So having inline functions that then call another function means
that the inline is almost entirely pointless (unless it was just to
set up default arguments or something).

The call is likely also about the same size as the code in the wait
loop. And the excuse is "so that we can add stuff to the wait loop".
What the f*ck? There's nothing to do in the wait-loop, and if you want
debugging or statistis or something, the spinlock should have been
out-of-line.

This is apparently for the auto-tuning, which came with absolutely no
performance numbers (except for the *regressions* it caused), and
which is something we have actively *avoided* in the past, because
back-off is a f*cking idiotic thing, and the only real fix for
contended spinlocks is to try to avoid the contention and fix the
caller to do something smarter to begin with.

In other words, the whole f*cking thing looks incredibly broken. At
least give some good explanations for why crap like this is needed,
instead of just implementing backoff without even numbers for real
loads. And no, don't bother to give numbers for pointless benchmarks.
It's easy to get contention on a benchmark, but spinlock backoff is
only remotely interesting on real loads.

                      Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/