Message-ID: <4A4D2FDA.3000509@gmail.com>
Date: Fri, 03 Jul 2009 00:08:26 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: David Howells <dhowells@...hat.com>, mingo@...e.hu,
akpm@...ux-foundation.org, paulus@...ba.org, arnd@...db.de,
linux-kernel@...r.kernel.org
Subject: [PATCH] x86: atomic64_t should be 8 bytes aligned
Linus Torvalds wrote:
>
> On Thu, 2 Jul 2009, Eric Dumazet wrote:
>> Using a fixed initial value (instead of __atomic64_read()) is even faster,
>> it apparently permits cpu to use an appropriate bus transaction.
>
> Yeah, I guess it does a "read-for-write-ownership" and allows the thing to
> be done as a single cache transaction.
>
> If we read it first, it will first get the cacheline for shared-read, and
> then the cmpxchg8b will need to turn it from shared to exclusive.
>
> Of course, the _optimal_ situation would be if the cmpxchg8b didn't
> actually do the write at all when the value matches (and all cores could
> just keep it shared), but I guess that's not going to happen.
>
> Too bad there is no pure 8-byte read op. Using MMX has too many downsides.
>
> Btw, your numbers imply that for the atomic64_add_return(), we really
> would be much better off not reading the original value at all. Again, in
> that case, we really do want the "read-for-write-ownership" cache
> transaction, not a read.
>
I forgot to mention that if an atomic64_t spans two cache lines, my test
program is 10x slower.
So we probably need an __attribute__((aligned(8))) on the atomic64_t definition
as well, since alignof(atomic64_t) is 4, not 8 (!!!)
#include <stdio.h>

typedef struct {
	unsigned long long counter;
} atomic64_t;

typedef struct {
	unsigned long long __attribute__((aligned(8))) counter;
} atomic64_ta;

struct {
	int a;
	atomic64_t counter;
} s __attribute__((aligned(64)));

struct {
	int a;
	atomic64_ta counter;
} sa __attribute__((aligned(64)));

int main()
{
	printf("alignof(atomic64_t)=%d\n", __alignof__(atomic64_t));
	printf("alignof(atomic64_ta)=%d\n", __alignof__(atomic64_ta));
	printf("alignof(unsigned long long)=%d\n", __alignof__(unsigned long long));
	printf("&s.counter=%p\n", &s.counter);
	printf("&sa.counter=%p\n", &sa.counter);
	return 0;
}

$ gcc -O2 -o test test.c ; ./test
alignof(atomic64_t)=4
alignof(atomic64_ta)=8
alignof(unsigned long long)=8
&s.counter=0x80496c4
&sa.counter=0x8049708
$ gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.3.3/configure --enable-languages=c,c++ --prefix=/usr
Thread model: posix
gcc version 4.3.3 (GCC)
[PATCH] x86: atomic64_t should be 8 bytes aligned
LOCKED instructions on two cache lines are painful.
Make sure an atomic64_t is 8-byte aligned.
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
diff --git a/arch/x86/include/asm/atomic_32.h b/arch/x86/include/asm/atomic_32.h
index 2503d4e..1927a56 100644
--- a/arch/x86/include/asm/atomic_32.h
+++ b/arch/x86/include/asm/atomic_32.h
@@ -250,7 +250,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
/* An 64bit atomic type */
typedef struct {
- unsigned long long counter;
+ unsigned long long __attribute__((__aligned__(8))) counter;
} atomic64_t;
#define ATOMIC64_INIT(val) { (val) }
--