Message-ID: <4A4D9FC4.1070201@gmail.com>
Date: Fri, 03 Jul 2009 08:05:56 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: unlisted-recipients:; (no To-header on input)
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
David Howells <dhowells@...hat.com>, mingo@...e.hu,
akpm@...ux-foundation.org, paulus@...ba.org, arnd@...db.de,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] FRV: Implement atomic64_t
Eric Dumazet wrote:
> I got a 4x speedup on a dual quad-core (Intel E5450) machine if all CPUs try
> to *read* the same atomic64 location.
>
> I tried various init values and got an additional 5% speedup by choosing a
> value *most probably* different from the actual atomic64 one,
> like (1LL << 32), with nice asm output...
>
> static inline unsigned long long atomic64_read(atomic64_t *ptr)
> {
> 	unsigned long long old = (1LL << 32);
>
> 	return cmpxchg8b(&ptr->counter, old, old);
> }
>
My last suggestion would be:
static inline unsigned long long atomic64_read(const atomic64_t *ptr)
{
	unsigned long long res;

	asm volatile(
		"mov %%ebx, %%eax\n\t"
		"mov %%ecx, %%edx\n\t"
		LOCK_PREFIX "cmpxchg8b %1\n"
		: "=A" (res)
		: "m" (*ptr)
		);
	return res;
}
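Since ebx/ecx are only read and their values can be arbitrary, they are not
even mentioned in the asm constraints, so gcc is allowed to keep useful values
in these registers. Copying them into eax/edx first makes the compare value
and the replacement value identical: if memory happens to match, the same
value is written back; otherwise cmpxchg8b loads the current memory value
into edx:eax. Either way edx:eax ends up holding the 64-bit value.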
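
To make that argument concrete, here is a rough, non-atomic model in plain C
of what the instruction sequence does. This is only a sketch for reasoning
about the logic; the names model_cmpxchg8b, model_atomic64_read and
model_selfcheck are made up for illustration and are not kernel functions,
and "junk" stands in for whatever happens to be in ecx:ebx.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Non-atomic model of cmpxchg8b (hypothetical helper, illustration only):
 * compare edx:eax with memory; on a match store ecx:ebx, otherwise load
 * memory into edx:eax. The return value is the final edx:eax.
 */
static uint64_t model_cmpxchg8b(uint64_t *mem, uint64_t edx_eax,
				uint64_t ecx_ebx)
{
	if (*mem == edx_eax)
		*mem = ecx_ebx;		/* match: store ecx:ebx */
	else
		edx_eax = *mem;		/* mismatch: load memory */
	return edx_eax;
}

/*
 * Model of the suggested read: the two "mov" instructions copy ecx:ebx
 * into edx:eax, so both the compare value and the new value are "junk".
 */
static uint64_t model_atomic64_read(uint64_t *mem, uint64_t junk)
{
	return model_cmpxchg8b(mem, junk, junk);
}

/* Both cases (junk differs from memory, junk equals memory). */
void model_selfcheck(void)
{
	uint64_t v = 0x1122334455667788ULL;

	assert(model_atomic64_read(&v, 0) == 0x1122334455667788ULL);
	assert(model_atomic64_read(&v, v) == 0x1122334455667788ULL);
	assert(v == 0x1122334455667788ULL);	/* memory never changes */
}
```

In both branches the returned value equals the memory content and the memory
content is left unchanged, which is why the register values do not matter.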
So the following (stupid) example:

	for (i = 0; i < 10000000; i++) {
		res += atomic64_read(&myvar);
	}

gives:

	xorl	%esi, %esi
.L2:
	mov	%ebx, %eax
	mov	%ecx, %edx
	lock;cmpxchg8b myvar
	addl	%eax, %ecx
	adcl	%edx, %ebx
	addl	$1, %esi
	cmpl	$10000000, %esi
	jne	.L2
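
For anyone who wants to play with the same idea outside the kernel, the
following is a minimal userspace sketch using the GCC/Clang builtin
__sync_val_compare_and_swap instead of hand-written asm. It is not the kernel
implementation: sketch_atomic64_read and sketch_selfcheck are hypothetical
names, the guess value (1ULL << 32) is simply the one quoted earlier in this
mail, and the code the compiler generates for the builtin will differ from
the asm shown above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read a 64-bit value with a compare-and-swap whose old and new values
 * are identical. The builtin returns the previous memory content; if the
 * guess happens to match, the same value is stored back.
 */
static inline uint64_t sketch_atomic64_read(uint64_t *ptr)
{
	uint64_t guess = 1ULL << 32;	/* most probably not the real value */

	return __sync_val_compare_and_swap(ptr, guess, guess);
}

/* Exercise both the "guess misses" and the "guess hits" case. */
void sketch_selfcheck(void)
{
	uint64_t a = 0x1122334455667788ULL;
	uint64_t b = 1ULL << 32;

	assert(sketch_atomic64_read(&a) == 0x1122334455667788ULL);
	assert(sketch_atomic64_read(&b) == (1ULL << 32));
	assert(a == 0x1122334455667788ULL && b == (1ULL << 32));
}
```

Note that the pointer is not const here: the builtin needs a writable
location, because a locked compare-and-exchange requires write access to the
memory even when the stored value is unchanged.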