lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1306429854.26735.9.camel@e102144-lin.cambridge.arm.com>
Date:	Thu, 26 May 2011 18:10:54 +0100
From:	Will Deacon <will.deacon@....com>
To:	Catalin Marinas <catalin.marinas@....com>
Cc:	Måns Rullgård <mans@...sr.com>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, sam@...nborg.org, ak@...ux.intel.com
Subject: Re: [PATCH] ARM: Do not allow unaligned accesses when
 CONFIG_ALIGNMENT_TRAP

Right... [adding bunch of people to CC],


On Wed, 2011-05-25 at 15:50 +0100, Catalin Marinas wrote:
> 2011/5/25 Måns Rullgård <mans@...sr.com>:
> > Dave Martin <dave.martin@...aro.org> writes:
> >
> >> On Wed, May 25, 2011 at 12:14:08PM +0100, Catalin Marinas wrote:
> >>> On Tue, May 24, 2011 at 06:13:31PM +0100, Dave Martin wrote:
> >>> > On Tue, May 24, 2011 at 04:26:35PM +0100, Catalin Marinas wrote:
> >>> > > BTW, are we sure that the code generated with unaligned accesses is
> >>> > > better? AFAIK, while processors support unaligned accesses, they may
> >>> > > not always be optimal.
> >>> >
> >>> > The code gcc generates to synthesise an unaligned access using aligned
> >>> > accesses is pretty simplistic:
> >>> ...
> >>> > For code which natively needs to read unaligned fields from data structures,
> >>> > I sincerely doubt that the CPU will not beat the above code for efficiency...
> >>> >
> >>> > So if there's code doing unaligned access to data structures for a good
> >>> > reason, building with unaligned access support turned on in the compiler
> >>> > seems a good idea, if that code might performance-critical for anything.
> >>>
> >>> gcc generates unaligned accesses in the the pcpu_dump_alloc_info()
> >>> function. We have a local variable like below (9 bytes):
> >>>
> >>>      char empty_str[] = "--------";
> >>>
> >>> and it looks like other stack accesses are unaligned:
> >>>
> >>> c0082ba0 <pcpu_dump_alloc_info>:
> >>> c0082ba0:   e92d4ff0    push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
> >>> c0082ba4:   e3074118    movw    r4, #28952  ; 0x7118
> >>> c0082ba8:   e24dd04c    sub sp, sp, #76 ; 0x4c
> >>> c0082bac:   e34c402a    movt    r4, #49194  ; 0xc02a
> >>> c0082bb0:   e58d1014    str r1, [sp, #20]
> >>> c0082bb4:   e58d0020    str r0, [sp, #32]
> >>> c0082bb8:   e8b40003    ldm r4!, {r0, r1}
> >>> c0082bbc:   e58d003f    str r0, [sp, #63]   <----------- !!!!!
> >>> c0082bc0:   e59d0014    ldr r0, [sp, #20]
> >>> c0082bc4:   e5d43000    ldrb    r3, [r4]
> >>>
> >>> I haven't tried with -mno-unaligned-access, I suspect the variables on
> >>> the stack would be aligned.
> >>
> >> So, it looks like empty_str may be misaligned on the stack, and the compiler
> >> is doing a misaligned store when initialising it.
> >
> > empty_str has type char[] so there are no alignment requirements.
> 
> I think the local variables after char empty_str[] are unaligned (int
> alloc). Changing the array size to 16 solves the issue.
> 
> The gcc guys here in ARM will have a look and I'll get back to you.

This issue seems to be caused by passing -fconserve-stack to GCC. This
was added in 8f7f5c9f ("kbuild: set -fconserve-stack option for gcc
4.5") and as you can see from the archive:

http://lkml.org/lkml/2009/9/20/39

it was thought to only have an impact on inlining decisions. Looking at
the documentation for GCC 4.6:

-fconserve-stack
          Attempt to minimize stack usage. The compiler will attempt to
use less stack space, even if that makes the program slower. This option
implies setting the ‘large-stack-frame’ parameter to 100 and the
‘large-stack-frame-growth’ parameter to 400.

So it sounds like we might not want to enable this blindly across all
architectures. Indeed, on ARM, it encourages the compiler to pack
variables on the stack which leads to the weird and wonderful alignment
situation that has been encountered in this thread.

Can we remove -fconserve-stack from the top-level Makefile (or at least
make it conditional by architecture)?

Will

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ