Message-ID: <CAK8P3a2XjZ6ZfHVu8gbAQ2ncj+GTR30Zj6oKUG8CTH4iswOnTw@mail.gmail.com>
Date: Fri, 8 Sep 2017 22:06:28 +0200
From: Arnd Bergmann <arnd@...db.de>
To: Ard Biesheuvel <ard.biesheuvel@...aro.org>
Cc: Romain Izard <romain.izard.pro@...il.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
LKML <linux-kernel@...r.kernel.org>,
Sven Schmidt <4sschmid@...ormatik.uni-hamburg.de>
Subject: Re: HAVE_EFFICIENT_UNALIGNED_ACCESS on ARM32 (was: Alignment issues
in zImage with Linux 4.12, LZ4 and GCC5.3)
On Thu, Sep 7, 2017 at 1:31 AM, Ard Biesheuvel
<ard.biesheuvel@...aro.org> wrote:
> On 7 September 2017 at 00:18, Arnd Bergmann <arnd@...db.de> wrote:
>> On Thu, Sep 7, 2017 at 12:48 AM, Ard Biesheuvel
>> I see lots of unaligned helpers in the lz4 code, is this not what
>> we hit?
>>
>> $ git grep unaligned lib/
>> lib/lz4/lz4_compress.c:#include <asm/unaligned.h>
>> lib/lz4/lz4_decompress.c:#include <asm/unaligned.h>
>> lib/lz4/lz4defs.h:#include <asm/unaligned.h>
>> lib/lz4/lz4defs.h: return get_unaligned((const U16 *)ptr);
>> lib/lz4/lz4defs.h: return get_unaligned((const U32 *)ptr);
>> lib/lz4/lz4defs.h: return get_unaligned((const size_t *)ptr);
>> lib/lz4/lz4defs.h: put_unaligned(value, (U16 *)memPtr);
>>
>
> Yes, you are right. The code I looked at before does cast a char* to a
> U32*, but it is in the compression path, so it has nothing to do with
> this issue.
>
> So I agree that access_ok.h is unsuitable for any 32-bit ARM core, and
> we should be using the struct version instead. My only remaining
> question is why we need access_ok.h in the first place: it is worth a
> try to check whether both produce the same code on AArch64.

It's been a while since I looked into this problem, but from memory,
it turned out to be rather hard to analyze single files after the change, as
gcc inlining decisions and register allocation tend to be non-deterministic.
However, my conclusion then was that the changes are rather random:
usually no effect, sometimes better and sometimes worse by chance,
with the only real differences being the few cases where we avoid the
ldrd/ldm/... instructions.

I have no idea which compiler version I tried back then, so it's very
possible that some older compilers actually do produce slightly worse
code with the struct version. The question is which is the oldest compiler
that we care about enough to investigate. Maybe gcc-4.8?

Arnd