Message-ID: <52EF7762.2060809@imgtec.com>
Date: Mon, 3 Feb 2014 11:02:58 +0000
From: James Hogan <james.hogan@...tec.com>
To: David Laight <David.Laight@...LAB.COM>
CC: 'Dan Carpenter' <dan.carpenter@...cle.com>,
Chen Gang <gang.chen.5i5j@...il.com>,
"devel@...verdev.osuosl.org" <devel@...verdev.osuosl.org>,
"andreas.dilger@...el.com" <andreas.dilger@...el.com>,
Antonio Quartulli <antonio@...hcoding.com>,
"Greg KH" <gregkh@...uxfoundation.org>,
"bergwolf@...il.com" <bergwolf@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
David Miller <davem@...emloft.net>,
"oleg.drokin@...el.com" <oleg.drokin@...el.com>,
"jacques-charles.lafoucriere@....fr"
<jacques-charles.lafoucriere@....fr>,
"jinshan.xiong@...el.com" <jinshan.xiong@...el.com>,
netdev <netdev@...r.kernel.org>,
"linux-metag@...r.kernel.org" <linux-metag@...r.kernel.org>
Subject: Re: [PATCH] drivers: staging: lustre: lustre: include: add "__attribute__((packed))"
for the related union
On 03/02/14 10:35, David Laight wrote:
> From: James Hogan
>> On 03/02/14 10:05, David Laight wrote:
>>> Architectures that define such alignment rules are a right PITA.
>>> You either need to get the size to 2 without using 'packed', or
>>> just not define such structures.
>>> It is worth seeing if adding aligned(2) will change the size - I'm
>>> not sure.
>>
>> __aligned(2) alone doesn't seem to have any effect on sizeof() or
>> __alignof__() unless it is accompanied by __packed. x86_64 is similar in
>> that respect (it just packs sanely in the first place).
>>
>> Combining __packed with __aligned(2) does the trick though (__packed
>> alone sets __aligned(1) which is obviously going to be suboptimal).
>
> Compile some code for a cpu that doesn't support misaligned transfers
> (probably one of sparc, arm, ppc) and see if the compiler generates a
> single 16bit request or two 8 bits ones.
> You don't want the compiler generating multiple byte-sized memory transfers.
Meta is also one of those arches. According to my quick tests, __packed
alone correctly makes the compiler fall back to byte loads/stores, whereas
__packed __aligned(2) makes it use 16-bit loads/stores. I've confirmed the
same behaviour with an ARM toolchain (see below for example).
Cheers
James
input:
#define __aligned(x) __attribute__((aligned(x)))
#define __packed __attribute__((packed))

union a {
        short x, y;
} __aligned(2) __packed;

struct b {
        short x;
} __aligned(2) __packed;

unsigned int soa = sizeof(union a);
unsigned int aoa = __alignof__(union a);
unsigned int sob = sizeof(struct b);
unsigned int aob = __alignof__(struct b);

void t(struct b *x, union a *y)
{
        ++x->x;
        ++y->x;
}
ARM output (-O2):
        .cpu arm10tdmi
        .fpu softvfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 2
        .eabi_attribute 34, 0
        .eabi_attribute 18, 4
        .file "alignment4.c"
        .text
        .align 2
        .global t
        .type t, %function
t:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldrh r3, [r0, #0]
        add r3, r3, #1
        strh r3, [r0, #0] @ movhi
        ldrh r3, [r1, #0]
        add r3, r3, #1
        strh r3, [r1, #0] @ movhi
        bx lr
        .size t, .-t
        .global aob
        .global sob
        .global aoa
        .global soa
        .data
        .align 2
        .type aob, %object
        .size aob, 4
aob:
        .word 2
        .type sob, %object
        .size sob, 4
sob:
        .word 2
        .type aoa, %object
        .size aoa, 4
aoa:
        .word 2
        .type soa, %object
        .size soa, 4
soa:
        .word 2
        .ident "GCC: (GNU) 4.7.1 20120606 (Red Hat 4.7.1-0.1.20120606)"
        .section .note.GNU-stack,"",%progbits
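
The size and alignment half of the test can also be checked without reading
assembly. What follows is only a minimal sketch, assuming GCC or a compatible
compiler and a target where short is 2 bytes. The EX_* macros, the ex_* type
names and ex_report_layout() are made up for this illustration (they are not
from the patch), and the macros are renamed so they cannot clash with libc
headers that already define __packed or __aligned.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's __aligned()/__packed macros. */
#define EX_ALIGNED(x) __attribute__((aligned(x)))
#define EX_PACKED     __attribute__((packed))

/* packed alone: alignment drops to 1 */
struct ex_packed_only { short x; } EX_PACKED;

/* packed plus aligned(2): size stays 2, alignment is raised back to 2 */
struct ex_packed_al2 { short x; } EX_PACKED EX_ALIGNED(2);

/* Compile-time checks; these assume sizeof(short) == 2 on the target. */
static_assert(sizeof(struct ex_packed_only) == 2, "packed size");
static_assert(__alignof__(struct ex_packed_only) == 1, "packed alignment");
static_assert(sizeof(struct ex_packed_al2) == 2, "packed+aligned size");
static_assert(__alignof__(struct ex_packed_al2) == 2, "packed+aligned alignment");

/*
 * Print the layout of both variants.
 * Returns the number of types whose size is not 2 (0 when all is well).
 */
int ex_report_layout(void)
{
        int bad = 0;

        printf("packed only:       size %zu, align %zu\n",
               sizeof(struct ex_packed_only),
               (size_t)__alignof__(struct ex_packed_only));
        printf("packed+aligned(2): size %zu, align %zu\n",
               sizeof(struct ex_packed_al2),
               (size_t)__alignof__(struct ex_packed_al2));

        if (sizeof(struct ex_packed_only) != 2)
                bad++;
        if (sizeof(struct ex_packed_al2) != 2)
                bad++;
        return bad;
}
```

Calling ex_report_layout() from main() prints the runtime view. The
static_assert lines make the build fail if a toolchain lays the types out
differently. This only covers sizeof/__alignof__; whether the compiler then
emits halfword or byte accesses still has to be checked in the generated
assembly as above.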