linux-kernel - Re: [PATCH 1/2] FRV: Implement atomic64

Message-ID: <4A4D2239.5000602@gmail.com>
Date:	Thu, 02 Jul 2009 23:10:17 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	David Howells <dhowells@...hat.com>, mingo@...e.hu,
	akpm@...ux-foundation.org, paulus@...ba.org, arnd@...db.de,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] FRV: Implement atomic64_t

Linus Torvalds wrote:
> 
> On Wed, 1 Jul 2009, David Howells wrote:
>> +
>> +#define ATOMIC64_INIT(i)	{ (i) }
>> +#define atomic64_read(v)	((v)->counter)
>> +#define atomic64_set(v, i)	(((v)->counter) = (i))
> 
> These seem to be buggy.
> 
> At least "atomic64_read()" needs to make sure to actually read it 
> atomically - otherwise you'll do two 32-bit reads, and that just gets 
> crap. Imagine if somebody is adding 1 to 0x00000000ffffffff, and then 
> "atomic64_read()" reads it as two accesses in the wrong place, and gets 
> either 0, or 0x00000001ffffffff, both of which are totally incorrect.
> 
> The case of 'atomic64_set()' is _slightly_ less clear, because I think we 
> use it mostly for initializers, so atomicity is often not strictly 
> required. But at least on x86, we do guarantee that it sets it atomically 
> too.
> 
> Btw, Ingo: I looked at the x86-32 versions to be sure, and noticed a 
> couple of buglets:
> 
>  - atomic64_xchg uses "atomic_read()". Sure, it happens to work, since 
>    the "atomic_read()" is not type-safe, and gets a non-atomic 64-bit 
>    read, but that looks really really bogus.
> 
>    It _should_ use __atomic64_read(), and the 64-bit versions should use a 
>    different counter name ("counter64"?) or we should use an inline 
>    function for atomic_read(), so that the type safety issue gets fixed.
> 
>  - atomic64_read() is being stupid with the whole loop thing. It _should_ 
>    just do
> 
> 	static inline unsigned long long atomic64_read(atomic64_t *ptr)
> 	{
> 		unsigned long long old = __atomic64_read(ptr);
> 		return cmpxchg8b(ptr, old, old);
> 	}
> 
>    and that's it. No loop. cmpxchg8b() will return the right thing.
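
To make the torn read described in the quoted text concrete, here is a small single-threaded sketch that replays the two bad interleavings by hand. The names split64, split64_add, torn_read_low_first and torn_read_high_first are made up for illustration and are not kernel code; the "writer" is simply called between the reader's two 32-bit loads.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit counter stored as two 32-bit words. */
struct split64 {
	uint32_t lo;
	uint32_t hi;
};

/* Writer: add n to the 64-bit value, propagating the carry by hand. */
static void split64_add(struct split64 *v, uint32_t n)
{
	uint32_t old_lo = v->lo;

	v->lo += n;
	if (v->lo < old_lo)	/* low word wrapped around */
		v->hi++;
}

/* Reader loads the low word, the writer runs, then the reader loads the high word. */
static uint64_t torn_read_low_first(struct split64 *v)
{
	uint32_t lo = v->lo;

	split64_add(v, 1);	/* simulated concurrent writer */
	return ((uint64_t)v->hi << 32) | lo;
}

/* Reader loads the high word, the writer runs, then the reader loads the low word. */
static uint64_t torn_read_high_first(struct split64 *v)
{
	uint32_t hi = v->hi;

	split64_add(v, 1);	/* simulated concurrent writer */
	return ((uint64_t)hi << 32) | v->lo;
}
```

Starting from 0x00000000ffffffff, the low-first order yields 0x00000001ffffffff and the high-first order yields 0, the two wrong values mentioned above; the counter never held either of them.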

Using a fixed initial value (instead of __atomic64_read()) is even faster;
it apparently lets the CPU use a more appropriate bus transaction.

static inline unsigned long long atomic64_read(atomic64_t *ptr)
{
	unsigned long long old = 0LL ;

	return cmpxchg8b(&ptr->counter, old, old);
}
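
For readers who want to try the idea outside the kernel, here is a minimal userspace sketch using C11 atomics rather than the cmpxchg8b() helper from this thread. The name read64_via_cas is hypothetical. It relies only on the fact that a compare-and-swap with new == old never changes the stored value, and that the value found in memory is reported back whether or not the guess matched.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of the read above, NOT the kernel's
 * atomic64_read(): read a 64-bit value with a single compare-and-swap.
 */
static uint64_t read64_via_cas(_Atomic uint64_t *ptr, uint64_t guess)
{
	uint64_t expected = guess;

	/*
	 * If *ptr == guess, guess is stored back and nothing changes.
	 * Otherwise nothing is stored and the current value of *ptr is
	 * copied into expected. Either way expected ends up holding the
	 * value that was in memory.
	 */
	atomic_compare_exchange_strong(ptr, &expected, guess);
	return expected;
}
```

Calling read64_via_cas(&counter, 0) or read64_via_cas(&counter, 1ULL << 32) returns the current value of counter and leaves it unchanged. On 32-bit targets the compiler may need libatomic or a cmpxchg8b-capable -march setting to make the 64-bit operation lock-free; that depends on the toolchain.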

I also rewrote cmpxchg8b() so that it no longer uses the %edi register, but a generic "+m" constraint instead.

static inline unsigned long long
cmpxchg8b(unsigned long long *ptr, unsigned long long old, unsigned long long new)
{
        /* cmpxchg8b takes the new value in %ecx:%ebx */
        unsigned long low = new;
        unsigned long high = new >> 32;

        /* "+A": old goes in via %edx:%eax, the value found at *ptr comes back */
        asm volatile(
                LOCK_PREFIX "cmpxchg8b %1\n"
                     :  "+A" (old), "+m" (*ptr)
                     :  "b" (low), "c" (high)
                     );
        return old;
}



I got a 4x speedup on a dual quad-core (Intel E5450) machine when all CPUs try
to *read* the same atomic64 location.

I tried various initial values and got an additional 5% speedup by choosing a
value *most probably* different from the actual atomic64 one,
like (1LL << 32), with nice asm output...

static inline unsigned long long atomic64_read(atomic64_t *ptr)
{
	unsigned long long old = (1LL << 32) ;

	return cmpxchg8b(&ptr->counter, old, old);
}