Message-Id: <5770FDBE.6000800@linux.vnet.ibm.com>
Date: Mon, 27 Jun 2016 18:19:42 +0800
From: xinhui <xinhui.pan@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>,
panxinhui <xinhui@...ux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@...il.com>, linux-kernel@...r.kernel.org,
mingo@...hat.com, dave@...olabs.net, will.deacon@....com,
Waiman.Long@....com, benh@...nel.crashing.org
Subject: Re: [PATCH] locking/osq: Drop the overload of osq lock
On 06/27/2016 15:55, Peter Zijlstra wrote:
> On Sun, Jun 26, 2016 at 12:59:01PM +0800, panxinhui wrote:
>>
>>>> On Jun 26, 2016, at 03:12, Peter Zijlstra <peterz@...radead.org> wrote:
>>>
>>> On Sun, Jun 26, 2016 at 01:27:51AM +0800, panxinhui wrote:
>>>
>>>> by the way I still think mutex_unlock has a big overload too.
>>>
>>> Do you mean overhead?
>>>
>> oh, maybe you are right.
>
>> mutex_unlock ’s implementation uses inc_return variant on ppc, and
>> that’s expensive. I am thinking of using cmpxchg instead.
>
> That statement doesn't make any sense. PPC is an LL/SC arch, inc_return
> and cmpxchg are the 'same' LL/SC loop.
>
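To make that point concrete, here is a minimal userspace sketch (GCC __atomic builtins, not the kernel code): an increment built from cmpxchg still needs a retry loop, so on an LL/SC arch it ends up doing the same load-reserve/store-conditional work as atomic_inc_return():

	/* userspace illustration only; on PPC each failed compare-exchange
	 * means another load-reserve/store-conditional round trip */
	static int inc_return_via_cmpxchg(int *p)
	{
		int old, new;

		do {
			old = __atomic_load_n(p, __ATOMIC_RELAXED);
			new = old + 1;
		} while (!__atomic_compare_exchange_n(p, &old, new, false,
						      __ATOMIC_RELEASE, __ATOMIC_RELAXED));
		return new;
	}
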
This is a small optimization, though.
If there are lock waiters, the lock count is negative. When we call mutex_unlock(), it increments the count; if the result is <= 0 we enter the unlock slowpath to wake up the waiters, and the slowpath sets the count back to 1.
So there is no need to increment the count when it is already negative; we can go straight to the slowpath and save one store, or several loads/stores if the LL/SC loop has to retry.
The basic idea is in the code below:

	if (atomic_read(&lk) < 0 ||	/* already negative: no need to call atomic_inc_return(), which is expensive */
	    atomic_inc_return(&lk) <= 0)
		/* go to the unlock slowpath */
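For completeness, a rough userspace sketch of the whole unlock fastpath with this check (names like unlock_slowpath() and the count protocol are placeholders as described above; an illustration, not the actual mutex code):

	#include <stdio.h>

	/* count protocol: 1 = unlocked, 0 = locked, < 0 = locked with waiters */
	static int lock_count = 1;

	static void unlock_slowpath(void)
	{
		/* wake up a waiter (omitted here) and release the lock */
		__atomic_store_n(&lock_count, 1, __ATOMIC_RELEASE);
	}

	static void unlock(void)
	{
		/* a plain load first: if the count is already negative there
		 * are waiters, so skip the atomic increment (an LL/SC loop on
		 * PPC) and go straight to the slowpath */
		if (__atomic_load_n(&lock_count, __ATOMIC_RELAXED) < 0 ||
		    __atomic_add_fetch(&lock_count, 1, __ATOMIC_RELEASE) <= 0)
			unlock_slowpath();
	}

	int main(void)
	{
		lock_count = -1;	/* pretend the lock is held with a waiter */
		unlock();
		printf("count after unlock: %d\n", lock_count);	/* prints 1 */
		return 0;
	}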