Message-ID: <50D51E56.4080200@redhat.com>
Date: Fri, 21 Dec 2012 21:43:34 -0500
From: Rik van Riel <riel@...hat.com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: linux-kernel@...r.kernel.org, aquini@...hat.com, walken@...gle.com,
lwoodman@...hat.com, jeremy@...p.org,
Jan Beulich <JBeulich@...ell.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH 3/3 -v2] x86,smp: auto tune spinlock backoff delay factor
On 12/21/2012 07:18 PM, Eric Dumazet wrote:
> On Fri, 2012-12-21 at 18:56 -0500, Rik van Riel wrote:
>> Argh, the first one had a typo in it that did not influence
>> performance with fewer threads running, but that made things
>> worse with more than a dozen threads...
>
>> +
>> +	/*
>> +	 * The lock is still busy, the delay was not long enough.
>> +	 * Going through here 2.7 times will, on average, cancel
>> +	 * out the decrement above. Using a non-integer number
>> +	 * gets rid of performance artifacts and reduces oversleeping.
>> +	 */
>> +	if (delay < MAX_SPINLOCK_DELAY &&
>> +	    ((inc.head & 3) == 0 || (inc.head & 7) == 1))
>> +		delay++;
>
> ((inc.head & 3) == 0 || (inc.head & 7) == 1)) seems a strange condition
> to me...
It is. It turned out that doing the increment
every 4 times (just the first check) resulted
in odd performance artifacts when running with
4, 8, 12 or 16 CPUs.

Moving to the combined condition got rid of the
performance artifact. The two checks are disjoint
and together true for 1/4 + 1/8 = 3/8 of all
ticket values, so the increment fires on average
once every 8/3 ~= 2.7 passes, which is where the
2.7 in the comment comes from.

It also results in aiming for a sleep period
that is not an exact multiple of the lock
acquiring period, which results in less
"oversleeping", and measurably better
performance.
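
For reference, a quick standalone check (plain
userspace C, not part of the patch) confirms
that average:

#include <stdio.h>

int main(void)
{
	int hits = 0, n = 1 << 16;	/* all 16-bit ticket values */

	/* Count how often the increment condition fires. */
	for (int head = 0; head < n; head++)
		if ((head & 3) == 0 || (head & 7) == 1)
			hits++;

	printf("true %d/%d times -> every %.2f passes\n",
	       hits, n, (double)n / hits);
	return 0;
}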