Message-ID: <511C24A6.8020409@redhat.com>
Date: Wed, 13 Feb 2013 18:41:26 -0500
From: Rik van Riel <riel@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Ingo Molnar <mingo@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>, rostedt@...dmiss.org,
aquini@...hat.com, Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Michel Lespinasse <walken@...gle.com>,
linux-tip-commits@...r.kernel.org
Subject: Re: [tip:core/locking] x86/smp: Move waiting on contended ticket
lock out of line
On 02/13/2013 05:40 PM, Linus Torvalds wrote:
> On Wed, Feb 13, 2013 at 2:21 PM, Rik van Riel <riel@...hat.com> wrote:
>>
>> What kind of numbers would you like?
>>
>> Numbers showing that the common case is not affected by this
>> code?
>>
>> Or numbers showing that performance of something is improved
>> with this code?
>>
>> Of course, the latter would point out a scalability issue,
>> that may be fixable by changing the data structures and code
>> involved :)
>
> Yes.
>
> To all three.
I will run a bunch of tests to demonstrate there are no
regressions with this code. You'll have to wait a few days
before I have a good set of numbers for that.
I have an example of the second case. It is a test case derived
from a customer issue, where an application contends on semaphores,
doing lock and unlock operations. The test case simply has N
threads trying to lock and unlock the same semaphore.
The attached graph (which I sent out with the intro email to
my patches) shows how reducing the memory accesses from the
spinlock wait path prevents the large performance degradation
seen with the vanilla kernel. This is on a 24-CPU system with
four 6-core AMD CPUs.
The "prop-N" series use a fixed delay with proportional back-off.
You can see that a small value of N does not help much for large
numbers of CPUs, while a large value hurts with a small number of
CPUs. The automatic tuning appears to be quite robust.
>> If we have just a few CPUs bouncing the cache line around, we
>> have no real performance issues and the loop with cpu_relax
>> seems to be doing fine.
>>
>> Real issues arise when we have a larger number of CPUs contending
>> on the same lock. At that point we know we are no longer bouncing
>> between sibling cores.
>
> Again. Exactly my point.
>
> So you're potentially making things worse for just about everybody, in
> order to fix a problem that almost nobody can actually see. And
> apparently you don't even know the problem.
If we have only a few CPUs contending on the lock, the delays
will be short. Furthermore, the CPU at the head of the queue
will run the old spinlock code, with just cpu_relax() and a
check of the lock each iteration.
>> This patch series is not as much for the spinlocks we know about
>> (we can probably fix those), but to prevent catastrophic
>> performance degradation when users run into contention on spinlocks
>> we do NOT know about.
>
> .. and as I already explained, THAT IS PURE AND UTTER BULLSHIT.
>
> It may make things WORSE. On all the things you haven't run to check
> that it does better.
>
> You just stated something that is not at all necessarily true. You
> changed the spinlock behavior, and then you blindly state that it will
> "fix" things on loads you haven't even tested.
I did not claim it will fix things. I claim that it helps reduce
the excessive cross-cpu memory accesses (from the spinlock acquisition
path) that can cause catastrophic performance degradation.
This happens to be what I, Michel, and Eric have observed.
Eric got a 45% increase in network throughput, and I saw roughly
a 4x improvement with the semaphore test. I realize these are not
"real workloads", and I will give you numbers with those once I have
gathered some, on different systems.
> For all we know, it goes exactly the other way around, and makes some
> unknown load perform much much worse, because it turns something that
> didn't *use* to be contended into a contended case, because the
> spinlock is slower on some piece of hardware (or under
> virtualization).
>
> We already saw one virtual load regress quite noticeably, didn't we?
The cause of that was identified (with pause loop exiting, the host
effectively does the back-off for us), and the problem is avoided
by limiting the maximum back-off value to something small on
virtual guests.
> And yet you go on to say that it will fix performance problems THAT WE
> DON'T EVEN KNOW ABOUT! After seeing *proof* to the exact contrary
> behavior! What f*cking planet are you from, again?
>
> Christ, that's hubris.
I realize you do not have time to read all the email discussions
around these patches, but please do not assume that everybody else
is a drooling idiot, unable to log into their computer in the
morning without a guide dog typing their password for them.
I will try to get you some performance test numbers on various kinds
of hardware over the next few days (probably early next week, depending
on hardware availability in the lab), running some mix of benchmarks
and workloads.
> Besides, out-of-line spinlock loops are *horrible* for performance
> monitoring. One of the *best* things about inlining the spinlock
> looping code is that it's so very obvious where a hot lock is used. It
> shows up as a shining beacon in profiles, without any need for
> callchains etc. So not only don't we know what loads it would improve,
> but those unknown loads would also be much harder to figure out.
Are there significant cases where "perf -g" is not easily available,
or where it hinders tracking down the performance issue?
Even with inlined spinlocks, you have the issue that they tend to
be taken from functions that can be called from multiple places.
At least, in the parts of the kernel I tend to look at...
> Seriously. I'm going to NAK these kinds of changes unless you can
> actually show real improvement on real loads. No more of this "in
> theory this will fix all our problems" crap.
I will get you numbers. Please hang on while I try to reserve
various lab systems, run programs on them, etc...
--
All rights reversed
Download attachment "spinlock-backoff-v2.png" of type "image/png" (21964 bytes)