linux-kernel - Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an arch doesn't support it

Message-ID: <20061207004638.GA24032@linux-mips.org>
Date:	Thu, 7 Dec 2006 00:46:38 +0000
From:	Ralf Baechle <ralf@...ux-mips.org>
To:	Christoph Lameter <clameter@....com>
Cc:	Russell King <rmk+lkml@....linux.org.uk>,
	David Howells <dhowells@...hat.com>, torvalds@...l.org,
	akpm@...l.org, linux-arm-kernel@...ts.arm.linux.org.uk,
	linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an arch doesn't support it

On Wed, Dec 06, 2006 at 11:16:55AM -0800, Christoph Lameter wrote:

> But then it also just requires disabling/enabling interrupts on UP, which
> may be cheaper than an atomic operation.
> 
> > For CPUs with load locked + store conditional, it is expensive.
> 
> Because it locks the bus? I am not that familiar with those architectures 
> but it seems that those will have a general problem anyways.

On a decent implementation ll/sc will have the same cost as ordinary
non-atomic load and store instructions.  A likely uniprocessor
implementation uses a single flip-flop ("llbit") in the CPU which is set
by ll and cleared by any exception handler, interrupts in particular.  A
later store conditional will then simply fail if that bit is cleared.
That is extremely trivial stuff.  On SMP it's somewhat more complex: a
processor will have to remember the address used with ll and start
snooping the bus for writes to it.  The store conditional will then
go and upgrade the cache line to exclusive state if the llbit is still
set and perform the store.  The llbit would be cleared if the processor
has snooped any other write to the cacheline.  Details are fun but that's
basically how it's implemented.
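
To make the mechanism concrete, here is a toy software model of the
llbit logic described above.  It is only a sketch: the names
(cpu_model, model_ll, model_sc and so on) and the 32-byte line size are
made up for illustration and do not come from any real CPU or kernel
source, and consuming the link on a successful sc is a choice of the
model, not a claim about hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is hypothetical. */
#define LINE_SHIFT 5            /* assumed 32-byte cache lines */

struct cpu_model {
    bool      llbit;            /* the single flip-flop */
    uintptr_t llline;           /* line remembered by ll (matters on SMP) */
};

static uintptr_t line_of(const volatile void *p)
{
    return (uintptr_t)p >> LINE_SHIFT;
}

/* ll: an ordinary load that also sets llbit and remembers the line. */
static int model_ll(struct cpu_model *cpu, const int *addr)
{
    cpu->llbit = true;
    cpu->llline = line_of(addr);
    return *addr;
}

/* Any exception, in particular an interrupt, clears llbit. */
static void model_exception(struct cpu_model *cpu)
{
    cpu->llbit = false;
}

/* SMP only: this CPU snooped another processor's write on the bus. */
static void model_snoop_write(struct cpu_model *cpu, const int *addr)
{
    if (line_of(addr) == cpu->llline)
        cpu->llbit = false;
}

/* sc: store only if llbit is still set; returns true on success. */
static bool model_sc(struct cpu_model *cpu, int *addr, int val)
{
    if (!cpu->llbit)
        return false;
    *addr = val;
    cpu->llbit = false;         /* model choice: the link is used up */
    return true;
}
```

In this model an ll followed directly by an sc succeeds, while an
intervening model_exception() or a model_snoop_write() to the same line
makes the sc fail and leaves memory untouched.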

Of course load linked / store conditional are typically used in loops
so there is a little extra overhead from that, especially when the
branch is mispredicted.
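
The loop in question looks roughly like the following sketch, written
with C11 atomics rather than raw ll/sc.  The helper name add_return is
made up for this example.  atomic_compare_exchange_weak is allowed to
fail spuriously, which is what lets a compiler for an ll/sc machine
typically turn it into an ll/sc pair, with the while loop supplying the
retry branch.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add delta to *p, return the new value. */
static int add_return(atomic_int *p, int delta)
{
    int old = atomic_load_explicit(p, memory_order_relaxed);

    /*
     * On failure the current value of *p is written back into 'old',
     * so each retry recomputes old + delta from fresh data.
     */
    while (!atomic_compare_exchange_weak(p, &old, old + delta))
        ;                       /* retry; this is the loop branch */

    return old + delta;
}
```

In the uncontended case the loop body runs once, so the extra cost is
the one conditional branch.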

Also note there is no locked cycle required to implement load linked /
store conditional.

  Ralf