linux-kernel - Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an

Message-ID: <Pine.LNX.4.64.0612102040490.12500@woody.osdl.org>
Date:	Sun, 10 Dec 2006 20:49:31 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	linux@...izon.com
cc:	nickpiggin@...oo.com.au, linux-arch@...r.kernel.org,
	linux-arm-kernel@...ts.arm.linux.org.uk,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an

On Sun, 10 Dec 2006, linux@...izon.com wrote:
> 
> While I agree that LL/SC can't be part of the kernel API for people to
> get arbitrarily clever with in the device driver du jour, they are *very*
> nice abstractions for shrinking the arch-specific code size.

I'm not sure. 

The thing is, it's _really_ hard to tell the compiler to not reload values 
from memory in between two inline asm statements.

So what you easily end up with is 
 (a) yes, you can actually get the compiler to generate the "obvious" code 
     sequence 99% of the time, and it will all work fine.
 (b) but it's really hard to actually guarantee it, and some subtle things 
     can really mess you up.

An example of (b) is how we actually put some of these atomic data 
structures on the stack ("struct completion" comes to mind), and it can 
get really interesting if something works in all the tests, but then 
subtly breaks on some microarchitectures when the data structures happen 
to be on the stack, just because the compiler happened to do a register 
reload to the stack at the wrong point.
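
To make (b) concrete, here is a minimal sketch (not from the original message) of the usual way around the problem: keep the load-linked, the arithmetic and the store-conditional inside a single asm statement, so the compiler has no chance to place a spill or reload between them. The name `sketch_atomic_add_return` and the `ll()`/`sc()` helpers mentioned in the comment are hypothetical. The AArch64 instructions are only one example of an LL/SC pair, and other targets fall back to a GCC/Clang builtin so the block compiles anywhere.

```c
/*
 * Fragile pattern (illustration only, ll()/sc() are hypothetical
 * one-instruction inline asm wrappers):
 *
 *     do {
 *         old = ll(p);            // asm statement 1
 *         new = old + i;          // compiler-generated code; may spill
 *     } while (!sc(p, new));      // asm statement 2
 *
 * Nothing stops the compiler from touching memory (for example the
 * stack) between the two asm statements.
 */

/* Whole LL/SC loop in one asm statement: no compiler code in between. */
static inline int sketch_atomic_add_return(int *p, int i)
{
#if defined(__aarch64__)
    int result;
    unsigned int failed;

    __asm__ volatile(
        "1: ldxr  %w0, %2\n"        /* load-linked                  */
        "   add   %w0, %w0, %w3\n"  /* compute the new value        */
        "   stxr  %w1, %w0, %2\n"   /* store-conditional            */
        "   cbnz  %w1, 1b\n"        /* retry if the store failed    */
        : "=&r"(result), "=&r"(failed), "+Q"(*p)
        : "r"(i)
        : "memory");
    return result;
#else
    /* Assumption: on other targets, let the compiler pick the sequence. */
    return __atomic_add_fetch(p, i, __ATOMIC_RELAXED);
#endif
}
```

The cost is exactly the one being discussed: every operation needs its own hand-written asm body per architecture, which is the code size the LL/SC abstraction was meant to shrink.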

Now, if you don't inline any of these things, you can control things a lot 
better, since then you end up having a much smaller set of circumstances, 
and you never have code "around" the actual operation that changes things 
like register reload. And yes, I do think that it might be possible to 
have some kind of generic "ll/sc template" setup for that case. You can 
often make gcc generate the code you want, especially if there is no real 
register pressure and you can keep the code simple.
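
A rough sketch of what such an out-of-line template could look like, under loud assumptions: portable C cannot express LL/SC directly, so `atomic_compare_exchange_weak()` from C11 stands in for the LL/SC pair (it is allowed to fail spuriously, much like a store-conditional). The macro `DEFINE_LLSC_OP` and the `sketch_*` function names are hypothetical, and `noinline` is the GCC/Clang attribute.

```c
#include <stdatomic.h>

/*
 * Each generated function is deliberately not inlined, so the retry
 * loop is compiled once, in isolation, with no surrounding code to
 * create register pressure around it.
 */
#define DEFINE_LLSC_OP(name, expr)                                   \
__attribute__((noinline))                                            \
int name(atomic_int *v, int arg)                                     \
{                                                                    \
    /* stand-in for the load-linked */                               \
    int old = atomic_load_explicit(v, memory_order_relaxed);         \
    int new;                                                         \
    do {                                                             \
        new = (expr);   /* the only per-operation part */            \
        /* stand-in for the store-conditional; on failure it */      \
        /* refreshes 'old' with the current value and we retry */    \
    } while (!atomic_compare_exchange_weak(v, &old, new));           \
    return new;                                                      \
}

DEFINE_LLSC_OP(sketch_add_return, old + arg)
DEFINE_LLSC_OP(sketch_sub_return, old - arg)
DEFINE_LLSC_OP(sketch_or_return,  old | arg)
```

Only the expression differs between operations, which is what makes a shared template plausible when the body between the load and the store stays this small.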

> The semantics are widely enough shared that it's quite possible in
> practice to write a good set of atomic primitives in terms of LL/SC
> and then let most architectures define LL/SC and simply #include the
> generic atomic op implementations.

Well, you do have to also realize that the architectures that don't do 
ll/sc do end up limiting the number of useful primitives, especially 
considering that we know that some architectures simply cannot do a lot 
between them (which _also_ limits it).

I think we've ended up implementing most of the common ones. We have a 
fairly big set of "atomic_add_return()"-like operations, and those are 
the obvious ones that can be done for _any_ ll/sc architecture too. So 
I don't think there's all that much more to be had there.

		Linus