Message-ID: <55BAD78A.1000308@hp.com>
Date:	Thu, 30 Jul 2015 22:03:54 -0400
From:	Waiman Long <waiman.long@...com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, mingo@...nel.org,
	jiangshanlai@...il.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
	dhowells@...hat.com, edumazet@...gle.com, dvhart@...ux.intel.com,
	fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com,
	dave@...olabs.net
Subject: Re: [PATCH tip/core/rcu 19/19] rcu: Add fastpath bypassing funnel
 locking

On 07/30/2015 10:44 AM, Peter Zijlstra wrote:
> On Fri, Jul 17, 2015 at 04:29:24PM -0700, Paul E. McKenney wrote:
>
>>   	/*
>> +	 * First try directly acquiring the root lock in order to reduce
>> +	 * latency in the common case where expedited grace periods are
>> +	 * rare.  We check mutex_is_locked() to avoid pathological levels of
>> +	 * memory contention on ->exp_funnel_mutex in the heavy-load case.
>> +	 */
>> +	rnp0 = rcu_get_root(rsp);
>> +	if (!mutex_is_locked(&rnp0->exp_funnel_mutex)) {
>> +		if (mutex_trylock(&rnp0->exp_funnel_mutex)) {
>> +			if (sync_exp_work_done(rsp, rnp0, NULL,
>> +					&rsp->expedited_workdone0, s))
>> +				return NULL;
>> +			return rnp0;
>> +		}
>> +	}
> So our 'new' locking primitives do things like:
>
> static __always_inline int queued_spin_trylock(struct qspinlock *lock)
> {
>          if (!atomic_read(&lock->val) &&
>             (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) == 0))
>                  return 1;
>          return 0;
> }
>
> mutexes do not do this.
>
> Now I suppose the question is, does that extra read slow down the
> (common) uncontended case? (remember, we should optimize locks for the
> uncontended case, heavy lock contention should be fixed with better
> locking schemes, not lock implementations).

I suppose the extra read may slow down the uncontended case, but I am
not sure by how much, as I haven't run any tests to quantify it.
However, there are use cases where it is advantageous to do a read
first, such as when the lock cacheline is likely to be hot (in the
slowpath, for example). So it depends on how the trylock is being used.
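
To make the trade-off concrete, here is a minimal userspace sketch of the
two trylock styles. It is only an illustration, not the kernel code: the
names demo_lock, demo_trylock_cas(), demo_trylock_read_first() and
demo_unlock() are made up for this example, and C11 atomics stand in for
the kernel's atomic_read()/atomic_cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = unlocked, 1 = locked. */
struct demo_lock {
	atomic_int val;
};

/*
 * Plain trylock: always issues the atomic compare-and-swap, so every
 * attempt asks for the cacheline in exclusive state, even when the
 * lock is already held and the attempt is bound to fail.
 */
static bool demo_trylock_cas(struct demo_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Read-first trylock: a plain load filters out the "already locked"
 * case, so a failing attempt needs only a shared copy of the
 * cacheline.  The cost is one extra load when the lock is free.
 */
static bool demo_trylock_read_first(struct demo_lock *lock)
{
	if (atomic_load_explicit(&lock->val, memory_order_relaxed) != 0)
		return false;
	return demo_trylock_cas(lock);
}

/* Release the lock; pairs with the acquire in the successful CAS. */
static void demo_unlock(struct demo_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

Both variants return the same result for the same lock state; they differ
only in the memory traffic a failed attempt generates, which is why the
answer depends on whether the caller expects the lock to be free.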

Cheers,
Longman
