Message-ID: <553FD3B6.3080909@ezchip.com>
Date:	Tue, 28 Apr 2015 14:38:46 -0400
From:	Chris Metcalf <cmetcalf@...hip.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Manfred Spraul <manfred@...orfullife.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Kirill Tkhai <ktkhai@...allels.com>,
	<linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...hat.com>,
	Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH 2/2] [PATCH] sched: Add smp_rmb() in task rq locking cycles

On 04/28/2015 02:24 PM, Peter Zijlstra wrote:
> A few questions:
> On Tue, Apr 28, 2015 at 02:00:48PM -0400, Chris Metcalf wrote:
>
>>> static inline void arch_spin_lock(arch_spinlock_t *lock)
>>> {
>>> 	unsigned short head, tail;
>>>
>>> 	___tns_lock(&lock->lock); /* XXX does the TNS imply a ___sync?  */
> Does it? Something needs to provide the ACQUIRE semantics.

Yes, __insn_xxx() is a compiler barrier on tile.
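
For reference, a barrier of this sort constrains only the compiler, not
the hardware.  A generic sketch of the idiom (illustrative only, not
the actual tile definition of the __insn_xxx() intrinsics):

	/* Empty asm with a "memory" clobber: the compiler may not
	 * cache memory values or reorder accesses across this point,
	 * but no hardware fence instruction is emitted. */
	#define barrier() __asm__ __volatile__("" : : : "memory")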

Tile architectures do not need any hardware-level "acquire" semantics
here, since a control dependency is normally sufficient for lock
acquisition.  Loads and stores are issued in order into the mesh
network, but an issued load does not block further instruction issue
until a register read depends on its result.  There is no speculative
execution.
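
To make the control-dependency argument concrete, here is the spin loop
from the code above with the reasoning spelled out in comments (a
sketch under the issue rules just described, not new semantics):

	/*
	 * The loop-exit branch depends on the value loaded from
	 * lock->tail.  With in-order issue and no speculation, no
	 * instruction after the loop can issue until that load has
	 * returned and the branch has resolved, so accesses in the
	 * critical section cannot float above the lock acquisition.
	 */
	while (READ_ONCE(lock->tail) != head)
		cpu_relax();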


>>> 	head = lock->head;
>>> 	lock->head++;
>>> 	___tns_unlock(&lock->lock);
>>>
>>> 	while (READ_ONCE(lock->tail) != head)
>>> 		cpu_relax();
>>> }
>>>
>>> static inline void arch_spin_unlock(arch_spinlock_t *lock)
>>> {
>>> 	/*
>>> 	 * can do with regular load/store because the lock owner
>>> 	 * is the only one going to do stores to the tail
>>> 	 */
>>> 	unsigned short tail = READ_ONCE(lock->tail);
>>> 	smp_mb(); /* MB is stronger than RELEASE */
> Note that your code uses wmb(); wmb() is, strictly speaking, not
> correct, as it's weaker than RELEASE.
>
> _However_, it doesn't make any practical difference, since all three
> barriers end up emitting __sync(), so it's not a bug per se.

Yes, this code dates back to 2.6.18 or so; today I would use
smp_store_release().  I like the trend in the kernel to move more
towards the C11 memory order model; I think it will help a lot.
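
As a concrete sketch, the unlock above could look like this with the
generic release accessor (assuming the elided final store is
"tail + 1", matching the ticket structure; illustrative, not the
actual tile code):

	static inline void arch_spin_unlock(arch_spinlock_t *lock)
	{
		/* A plain load is fine: the lock holder is the only
		 * writer of ->tail. */
		unsigned short tail = READ_ONCE(lock->tail);

		/*
		 * RELEASE orders all prior loads *and* stores in the
		 * critical section before the hand-off store becomes
		 * visible; smp_wmb() would order only prior stores,
		 * which is why it is weaker than RELEASE.
		 */
		smp_store_release(&lock->tail, tail + 1);
	}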

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
