linux-kernel - Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an arch doesn't support it

Message-ID: <Pine.LNX.4.64.0612081101280.3516@woody.osdl.org>
Date:	Fri, 8 Dec 2006 11:15:58 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	Christoph Lameter <clameter@....com>
cc:	Russell King <rmk+lkml@....linux.org.uk>,
	David Howells <dhowells@...hat.com>,
	Nick Piggin <nickpiggin@...oo.com.au>, akpm@...l.org,
	linux-arm-kernel@...ts.arm.linux.org.uk,
	linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an arch
 doesn't support it

On Fri, 8 Dec 2006, Christoph Lameter wrote:
> 
> As also shown in this thread: There are restrictions on what you can do 
> between ll/sc

This, btw, is almost certainly true on ARM too.

There are three major reasons for restrictions on ll/sc:

 - bus-cycle induced things (eg variations of "you cannot do a store in 
   between the ll and the sc, because it will touch the cache and clear 
   the bit", where "the store" might be a load too, and "the cache" might
   be just "the bus interface")

 - trap handling usually clears the internal lock bit too, which means 
   that depending on the micro-architecture, even internal microtraps 
   (like even just branch misprediction, but more commonly things like TLB 
   misses etc) can cause a sc to always fail.

 - timing. Livelock in particular.

The last one is the one that hits everybody, regardless of 
microarchitecture. The rule may be that the LL and the SC need to be 
within a certain number of cycles of each other (which can be very small - 
like ten) in order to guarantee that the cacheline can't be stolen. 

All of which means that _nobody_ can really do this reliably in C. Even if 
there are no other microarchitectural rules (and it sounds like that might 
be true on ARM), the timing issue means that you can _still_ only use it 
for very specific and simple sequences, and trying to expose it as a 
higher-level thing is not going to work in general for anything even 
remotely complicated.

(The timing may also mean that you end up having to do random back-off 
etc, just to make sure _somebody_ makes progress. Ie it might not be a 
matter of "within ten cycles", but "you need to randomize the timing").

In other words, it's simply not an option to expose LL/SC as an interface. 
It would be VERY convenient to do, since cmpxchg can emulate ll/sc (the 
"ll" part is a normal load, the "sc" part is a "compare that the old value 
still matches, and store the new one if so"). But because you can't expose 
LL/SC anyway in any reasonably portable way, that just doesn't work.
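
The emulation described above (plain load as the "ll", compare-and-store as 
the "sc") can be sketched as follows. This is only an illustration using C11 
atomics in user space; fetch_add_via_cmpxchg is a hypothetical name, not a 
kernel interface.

```c
#include <stdatomic.h>

/* Add `delta` to *v using nothing but a load and compare-and-exchange.
 * Returns the value *v held before the update. */
static int fetch_add_via_cmpxchg(atomic_int *v, int delta)
{
    /* The "ll" part: a normal load. */
    int old = atomic_load(v);

    /* The "sc" part: store the new value only if *v still equals `old`.
     * On failure, `old` is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(v, &old, old + delta))
        ;

    return old;
}
```

One difference worth keeping in mind: cmpxchg only checks that the value 
matches, whereas a real sc fails if the location was written at all, so the 
emulation cannot notice a value that changed and then changed back.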

So, you really do end up with three possibilities:

 - do things with TRULY PORTABLE interfaces. And like it or not, cmpxchg 
   is the closest thing you can get to that. It's trivial to do cmpxchg 
   using ll/sc (modulo the "random backoff part" if you need it, which is 
   still pretty simple, but no longer totally trivial), and architectures 
   that have neither ll/sc _nor_ a native cmpxchg can just go screw 
   themselves with spinlocks - they really aren't worth worrying about in 
   SMP. At some point you have to tell hardware designers that their 
   hardware just sucks.

 - have ugly conditional code in generic code. I personally think this is 
   a _much_ worse option in most cases.

 - have a much higher-level interface and make it _all_ architecture- 
   dependent (possibly with a "generic" version for sane architectures). 
   This works, but the more high-level it is, the more you end up having 
   the same thing written in many different ways, and nasty maintenance.

   So we generally set the bar pretty low. Things like semaphore locking 
   primitives are high-level enough already that we prefer to try to make 
   them use common lower-level interfaces (spinlocks, cmpxchg etc). 
   Something like kernel/workqueue.c is _way_ too high a level to do 
   arch-specific.
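
For the first option, the fallback for hardware with neither ll/sc nor a 
native cmpxchg could look roughly like the sketch below: a cmpxchg built on 
top of a lock. This is a user-space illustration with a C11 atomic_flag 
standing in for a spinlock; cmpxchg_emulated and emu_lock are hypothetical 
names, and a uniprocessor kernel version would presumably disable interrupts 
around the compare-and-store instead of spinning.

```c
#include <stdatomic.h>

/* One global lock guarding every emulated cmpxchg (assumption: a real
 * implementation might hash the address to pick one of several locks). */
static atomic_flag emu_lock = ATOMIC_FLAG_INIT;

/* If *ptr == old, store new_val. Always returns the value that was in
 * *ptr before the call, so the caller can tell whether it succeeded. */
static unsigned long cmpxchg_emulated(unsigned long *ptr,
                                      unsigned long old,
                                      unsigned long new_val)
{
    unsigned long prev;

    /* Spin until we own the lock. */
    while (atomic_flag_test_and_set_explicit(&emu_lock,
                                             memory_order_acquire))
        ;

    prev = *ptr;
    if (prev == old)
        *ptr = new_val;

    atomic_flag_clear_explicit(&emu_lock, memory_order_release);
    return prev;
}
```

This is only atomic with respect to other accesses that take the same lock, 
so every writer to *ptr has to go through cmpxchg_emulated (or otherwise 
hold emu_lock) for it to mean anything.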

So right now, I think the "cmpxchg" and the "bitmask set" approaches are the 
alternatives. Russell - LL/SC simply isn't on the table as an interface, 
whether you like it or not.

		Linus