Date:	Tue, 14 Jan 2014 17:00:56 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Jason Low <jason.low2@...com>
Cc:	mingo@...hat.com, peterz@...radead.org, paulmck@...ux.vnet.ibm.com,
	Waiman.Long@...com, torvalds@...ux-foundation.org,
	tglx@...utronix.de, linux-kernel@...r.kernel.org, riel@...hat.com,
	davidlohr@...com, hpa@...or.com, aswin@...com, scott.norton@...com
Subject: Re: [RFC 3/3] mutex: When there is no owner, stop spinning after
 too many tries

On Tue, 14 Jan 2014 16:33:10 -0800 Jason Low <jason.low2@...com> wrote:

> When running workloads that have high contention in mutexes on an 8 socket
> machine, spinners would often spin for a long time with no lock owner.
> 
> One of the potential reasons for this is that a thread can be preempted
> after clearing lock->owner but before releasing the lock, or preempted after
> acquiring the mutex but before setting lock->owner. In those cases, the
> spinner cannot check whether the owner is still on_cpu because lock->owner is NULL.

That sounds like a very small window.  And your theory is that this
window is being hit sufficiently often to impact aggregate runtime
measurements?  That sounds improbable to me.
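
For anyone following along, the window being described looks roughly like
this.  This is a userspace sketch using C11 atomics, not the real
kernel/locking/mutex.c code; toy_mutex, toy_unlock and the count
convention (1 = unlocked, 0 = locked) are illustrative names only:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for struct mutex: count 1 = unlocked, 0 = locked,
 * owner = pointer identifying the holder, or NULL.  Illustrative only. */
struct toy_mutex {
	atomic_int count;
	_Atomic(void *) owner;
};

/* Unlock in the order described above: owner is cleared *before* the
 * lock itself is released.  A thread preempted between the two stores
 * still holds the lock while spinners observe owner == NULL. */
static void toy_unlock(struct toy_mutex *m)
{
	atomic_store(&m->owner, NULL);	/* step 1: clear owner          */
	/* <-- preemption here: lock still held, but owner is NULL -->  */
	atomic_store(&m->count, 1);	/* step 2: release the lock     */
}
```

Replaying only step 1 freezes exactly the mid-unlock state a spinner
would see: the lock held (count == 0) with no owner to watch.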

> A solution that would address the preemption part of this problem would
> be to disable preemption between acquiring/releasing the mutex and
> setting/clearing the lock->owner. However, that will require adding overhead
> to the mutex fastpath.

preempt_disable() is cheap, and sometimes free.

Have you confirmed that the preempt_disable() approach actually fixes
the performance issues?  If it does then this would confirm your
"potential reason" hypothesis.  If it doesn't then we should be hunting
further for the explanation.
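
The shape of that approach would be something like the following.  Again
a userspace sketch, not the real fastpath: preempt_disable()/
preempt_enable() are stubbed with a nesting counter, and acquire/release
are reduced to plain stores:

```c
#include <stddef.h>

/* Stub the preemption-control primitives as a nesting counter so the
 * pairing is checkable here; in the kernel these are the real thing. */
static int preempt_count;
#define preempt_disable()	((void)(preempt_count++))
#define preempt_enable()	((void)(preempt_count--))

/* Toy mutex: count 1 = unlocked, 0 = locked; names are illustrative. */
struct toy_mutex {
	int count;
	void *owner;
};

/* Fastpath with the window closed: the thread cannot be preempted
 * between taking the lock and publishing itself as owner...          */
static void toy_lock(struct toy_mutex *m, void *task)
{
	preempt_disable();
	m->count = 0;		/* acquire (stand-in for real fastpath) */
	m->owner = task;	/* owner published before preemption    */
	preempt_enable();
}

/* ...and not between clearing owner and releasing the lock either.   */
static void toy_unlock(struct toy_mutex *m)
{
	preempt_disable();
	m->owner = NULL;
	m->count = 1;
	preempt_enable();
}
```

With this ordering a spinner can never observe "lock held, owner NULL"
due to preemption alone; the cost is the extra preempt_disable()/
preempt_enable() pair on every fastpath operation.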

> The solution used in this patch is to limit the number of times a
> thread can spin on lock->count when !owner.
> 
> The threshold used in this patch for each spinner was 128, which appeared to
> be a generous value, but any suggestions on another method to determine
> the threshold are welcome.

It's a bit hacky, isn't it?  If your "owner got preempted in the
window" theory is correct then I guess this is reasonable-ish.  But if
!owner is occurring for other reasons then perhaps there are better
solutions.