Message-ID: <52FE64FA.6040803@hp.com>
Date: Fri, 14 Feb 2014 13:48:26 -0500
From: Waiman Long <waiman.long@...com>
To: Peter Zijlstra <peterz@...radead.org>
CC: linux-kernel@...r.kernel.org, Jason Low <jason.low2@...com>,
mingo@...nel.org, paulmck@...ux.vnet.ibm.com,
torvalds@...ux-foundation.org, tglx@...utronix.de, riel@...hat.com,
akpm@...ux-foundation.org, davidlohr@...com, hpa@...or.com,
andi@...stfloor.org, aswin@...com, scott.norton@...com,
chegu_vinod@...com
Subject: Re: [PATCH 7/8] locking: Introduce qrwlock
On 02/13/2014 11:35 AM, Peter Zijlstra wrote:
> On Tue, Feb 11, 2014 at 03:12:59PM -0500, Waiman Long wrote:
>> I used the same locktest program to repeatedly take a single rwlock with
>> a programmable number of threads and counted their execution times. Each
>> thread takes the lock 5M times on a 4-socket, 40-core Westmere-EX
>> system. I bound all the threads to different CPUs with the following
>> 3 configurations:
>>
>> 1) Both CPUs and lock are in the same node
>> 2) CPUs and lock are in different nodes
>> 3) Half of the CPUs are in the same node as the lock & the other half
>> are remote
> I can't find these configurations in the below numbers; especially the first is
> interesting because most computers out there have no nodes.
The measurement data that I sent out has local and remote numbers. The
local ones are for when both the CPUs and the lock are in the same node;
the remote ones are for when they are in different nodes.
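For reference, the general shape of that kind of run looks roughly like the
user-space sketch below (illustrative only; the CPU numbering, thread count
and rwlock flavor are placeholders, and the actual locktest program is not
reproduced here):

/*
 * Minimal user-space sketch of a locktest-style run (illustrative only;
 * the real locktest program is not reproduced here).  Each thread is
 * pinned to a chosen CPU and takes a shared rwlock ITERATIONS times;
 * the per-thread elapsed time is what the local/remote numbers compare.
 * Build: gcc -O2 -pthread locktest_sketch.c
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERATIONS 5000000L     /* 5M lock/unlock pairs per thread */

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

struct worker {
    pthread_t tid;
    int cpu;                    /* CPU this thread is bound to */
    double secs;                /* measured elapsed time */
};

static void *worker_fn(void *arg)
{
    struct worker *w = arg;
    struct timespec t0, t1;
    cpu_set_t set;
    long i;

    /* Pin the thread so local vs. remote node placement is controlled. */
    CPU_ZERO(&set);
    CPU_SET(w->cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < ITERATIONS; i++) {
        pthread_rwlock_wrlock(&lock);
        pthread_rwlock_unlock(&lock);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    w->secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    return NULL;
}

int main(int argc, char **argv)
{
    int nthreads = argc > 1 ? atoi(argv[1]) : 2;
    struct worker *w = calloc(nthreads, sizeof(*w));
    int i;

    for (i = 0; i < nthreads; i++) {
        w[i].cpu = i;           /* pick CPUs to match configurations 1-3 */
        pthread_create(&w[i].tid, NULL, worker_fn, &w[i]);
    }
    for (i = 0; i < nthreads; i++) {
        pthread_join(w[i].tid, NULL);
        printf("thread %d (cpu %d): %.3f s\n", i, w[i].cpu, w[i].secs);
    }
    free(w);
    return 0;
}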
>> Two types of qrwlock are tested:
>> 1) Use MCS lock
>> 2) Use ticket lock
> arch_spinlock_t; you forget that if you change that to an MCS style lock
> this one goes along for free.
Yes, I am aware of that. I am not saying that it is a bad idea to use
arch_spinlock_t. I will be happy if your version of the qrwlock patch gets
merged. I am just saying that it may be a better idea to use the MCS lock
directly, especially in the case that the spinlock is not converted to use
an MCS-style lock. I will be even happier if that happens.
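For anyone following along, the MCS scheme I am referring to looks roughly
like the user-space sketch below in C11 atomics (simplified; this is not the
kernel's mcs_spinlock code, and the names are only illustrative):

/*
 * User-space sketch of an MCS-style queue lock in C11 atomics
 * (simplified; not the kernel's mcs_spinlock code).  Each waiter spins
 * on its own node's ->locked flag instead of on the shared lock word.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;             /* true while this waiter must spin */
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *prev;

    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&node->locked, true, memory_order_relaxed);

    /* Join the tail of the queue; this is the only contended update. */
    prev = atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (!prev)
        return;                     /* queue was empty: lock acquired */

    /* Link behind the previous waiter, then spin on our own node. */
    atomic_store_explicit(&prev->next, node, memory_order_release);
    while (atomic_load_explicit(&node->locked, memory_order_acquire))
        ;                           /* cpu_relax() in kernel code */
}

void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next = atomic_load_explicit(&node->next,
                                                 memory_order_acquire);

    if (!next) {
        /* No visible successor: try to swing the tail back to empty. */
        struct mcs_node *expected = node;

        if (atomic_compare_exchange_strong_explicit(&lock->tail, &expected,
                NULL, memory_order_acq_rel, memory_order_acquire))
            return;

        /* A new waiter is between its xchg and its ->next store. */
        while (!(next = atomic_load_explicit(&node->next,
                                             memory_order_acquire)))
            ;
    }
    /* Hand the lock directly to the successor. */
    atomic_store_explicit(&next->locked, false, memory_order_release);
}

The local spinning on a per-waiter node is what makes this attractive on a
multi-node box like the one above, independent of what arch_spinlock_t ends
up being.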
>
> On that; I had a look at your qspinlock and got a massive head-ache so I
> rewrote it. Aside from being very messy code, it also suffered from a few
> fairness issues in that it is possible (albeit highly unlikely) to steal
> a lock instead of being properly queued; per your xchg() usage.
>
> The below boots; but I've not done much else with it, so it will
> probably explode in your face.
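Right; for reference, the kind of stealing you are describing can be sketched
like this (a simplified illustration with made-up names, not my actual
qspinlock code): an xchg()-based fast path that only touches the lock byte
can succeed while earlier arrivals are still sitting in the queue.

/*
 * Simplified illustration of the unfairness (not the actual qspinlock
 * code).  The names and layout here are made up for the example.
 */
#include <stdatomic.h>
#include <stdbool.h>

struct mcs_node;                        /* waiter node, as in an MCS queue */

struct unfair_lock {
    atomic_bool locked;                 /* lock byte taken with xchg() */
    _Atomic(struct mcs_node *) tail;    /* queued waiters */
};

bool trylock_fastpath(struct unfair_lock *l)
{
    /*
     * The xchg() looks only at the lock byte, so it can succeed the
     * instant the previous holder clears ->locked, even while ->tail
     * still points at waiters that arrived much earlier.  That is the
     * stealing window: a fair version has to make new arrivals observe
     * the queue (or a combined lock word) before claiming the lock.
     */
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}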
Thanks for looking into my qspinlock patch. I will take a look at your
changes and incorporate them to make it fairer. I have already
rewritten it along the same lines as your version of the qrwlock patch. I
have done some performance testing at low contention levels using my
microbenchmark. The qspinlock was indeed slower than the ticket lock with
2-4 contending tasks; the break-even point is at 5 contending tasks. To
fix this performance deficit, I added an optimized x86-specific
contention path for 2 contending tasks so that it would perform better
than the ticket lock. It will still be somewhat slower with 3-4
contending tasks, but the 2-contending-task case is probably the most
common.
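For illustration, the rough shape of that 2-task path is something like a
pending bit in the lock word (a hypothetical sketch with made-up names and
layout; the real x86 code will be in the patch I post):

/*
 * Hypothetical sketch of a "second contender" path (made-up names and
 * layout; the real x86 code will be in the patch).  With exactly two
 * contending tasks, the waiter skips the MCS queue entirely: it marks
 * itself pending in the lock word and spins on the lock byte, which is
 * cheaper than queuing for this common case.  For simplicity this
 * ignores the queue-tail bits a third contender would set, and assumes
 * the unlock path clears only the low lock byte.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define Q_LOCKED    0x001u              /* low byte: lock held */
#define Q_PENDING   0x100u              /* second byte: one spinning waiter */

bool try_pending_path(_Atomic uint32_t *lock)
{
    uint32_t val = atomic_load_explicit(lock, memory_order_relaxed);

    /* Only usable when the holder is alone: no pending bit, no queue. */
    if (val != Q_LOCKED)
        return false;

    /* Claim the pending slot; fall back to the queue if we race. */
    if (!atomic_compare_exchange_strong_explicit(lock, &val,
            Q_LOCKED | Q_PENDING,
            memory_order_acquire, memory_order_relaxed))
        return false;

    /* Spin until the holder drops the lock byte, then take over. */
    while (atomic_load_explicit(lock, memory_order_acquire) & Q_LOCKED)
        ;                               /* cpu_relax() in the kernel */

    atomic_store_explicit(lock, Q_LOCKED, memory_order_release);
    return true;
}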
With that change, I would say that my qspinlock patch should be good
enough as a replacement for the ticket spinlock on x86. I will send out an
updated qspinlock patch in a day or two when I finish my testing.
-Longman