Date:	Mon, 8 Apr 2013 07:38:39 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Waiman Long <Waiman.Long@...com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	David Howells <dhowells@...hat.com>,
	Dave Jones <davej@...hat.com>,
	Clark Williams <williams@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"Chandramouleeswaran, Aswin" <aswin@...com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH RFC 1/3] mutex: Make more scalable by doing less atomic operations

On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> AFAICS the main performance trade-off is the following: when the owner CPU unlocks
> the mutex, we'll poll it via a read first, which turns the cacheline into
> shared-read MESI state. Then we notice that its content signals 'lock is
> available', and we attempt the trylock again.
>
> This increases lock latency in the few-contended-tasks case slightly - and we'd
> like to know by precisely how much, not just for a generic '10-100 users' case
> which does not tell much about the contention level.

We had this problem for *some* lock where we used a "read + cmpxchg"
in the hotpath and it caused us problems due to two cacheline state
transitions (first to shared, then to exclusive). It was faster to
just assume it was unlocked and try to do an immediate cmpxchg.
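As a rough illustration of the two approaches (a sketch using C11 atomics, not the kernel's code; the type and function names are hypothetical, and the lock word is assumed to be 1 when unlocked and 0 when locked):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 1 = unlocked, 0 = locked. */
typedef struct {
    atomic_int count;
} sketch_lock;

/*
 * Variant A: read first, then cmpxchg.
 * The plain load can pull the cacheline in shared state, and the
 * cmpxchg then has to upgrade it to exclusive: two transitions.
 */
static bool sketch_trylock_read_then_cmpxchg(sketch_lock *l)
{
    int expected = 1;

    if (atomic_load_explicit(&l->count, memory_order_relaxed) != 1)
        return false;
    return atomic_compare_exchange_strong_explicit(&l->count, &expected, 0,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/*
 * Variant B: assume the lock is free and go straight to cmpxchg.
 * The line is requested in exclusive state once: one transition.
 */
static bool sketch_trylock_cmpxchg_only(sketch_lock *l)
{
    int expected = 1;

    return atomic_compare_exchange_strong_explicit(&l->count, &expected, 0,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}
```

Both variants return the same result; they differ only in the cacheline traffic they generate. Variant B tends to win on an uncontended hotpath, while Variant A avoids bouncing the line between spinners when the lock is usually held.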

But iirc it is a non-issue for this case, because this is only about
the contended slow path.

I forget where we saw the case where we should *not* read the initial
value, though. Anybody remember?

That said, the MUTEX_SHOULD_XCHG_COUNT macro should die. Why shouldn't
all architectures just consider negative counts to be locked? It
doesn't matter that some might only ever see -1.
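A minimal sketch of what such an architecture-independent test could look like (hypothetical helper names, not the kernel's API; it assumes the count convention of 1 = unlocked, 0 = locked, negative = locked with possible waiters):

```c
#include <stdbool.h>

/*
 * Hypothetical helpers. Any negative value is treated the same way,
 * so it does not matter whether a given architecture only ever
 * stores -1 or lets the count go further below zero.
 */
static bool sketch_count_is_locked(int count)
{
    return count <= 0;
}

static bool sketch_count_may_have_waiters(int count)
{
    return count < 0;
}
```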

            Linus