Message-ID: <CAK8P3a3KLjuntMkJW3xkgO5a-nTBVZm8LJVnX-RwsWJ084kuOw@mail.gmail.com>
Date: Fri, 22 Sep 2017 21:17:28 +0200
From: Arnd Bergmann <arnd@...db.de>
To: Joe Perches <joe@...ches.com>
Cc: Colin Ian King <colin.king@...onical.com>,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Sven Schmidt <4sschmid@...ormatik.uni-hamburg.de>,
Andrew Morton <akpm@...ux-foundation.org>,
kernel-janitors@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@...ches.com> wrote:
> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>>    text    data     bss     dec     hex filename
>>   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
>>   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
>
> Perhaps not so much a gcc bug as an opportunity
> for gcc to add an additional optimization.
>
> gcc would have to verify that the const array is
> not initialized with some variable or argument like:
>
> int foo(int a)
> {
> 	const int array[] = {1, a};
> 	...
> }
It depends. With a 10KB difference in .text size, my guess is that this
is a case where gcc does the right optimization in principle, but
fails to do what was intended in some corner cases.
I just cross-checked by building with clang; there the patch has
no impact on code size: it is 24929 bytes with or without the patch.
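For anyone following along, the kind of change the patch makes can be
sketched roughly as below. This is only an illustration, not the actual
lib/lz4 code: the function names and table contents are hypothetical.
The automatic-storage array is conceptually a new object on every call,
so a compiler may emit stores to build it on the stack unless it proves
that unnecessary; the static one exists once in a read-only section.

```c
#include <stddef.h>

/* Hypothetical small lookup table; values are illustrative only. */

/* Automatic storage: conceptually a fresh array on each call, so the
 * compiler may generate code to initialize it on the stack. */
int lookup_auto(size_t i)
{
	const int table[8] = {0, 1, 2, 1, 4, 4, 4, 4};

	return table[i & 7];
}

/* Static storage: a single read-only copy shared by all calls. */
int lookup_static(size_t i)
{
	static const int table[8] = {0, 1, 2, 1, 4, 4, 4, 4};

	return table[i & 7];
}
```

Both functions return the same values; whether the generated code
differs in size depends on the compiler and its version, which is what
the numbers in this mail are about.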
Looking at other versions of (x86) gcc, I see .text sizes of
             after   before
gcc-3.4.6    10855    12977
gcc-4.0.4    11088    11088
gcc-4.1.3    10973    10973
gcc-4.2.5    11183    11183
gcc-4.3.6    15501    17724
gcc-4.4.7    13337    15693
gcc-4.5.4    13162    15491
gcc-4.6.4    14846    17302
gcc-4.7.4    14187    16294
gcc-4.8.5    16591    18730
gcc-4.9.4    19582    21995
gcc-5.4.1    18294    22510
gcc-6.1.1    20487    25172
gcc-6.3.1    20487    25172
gcc-7.0.0    20351    31789
gcc-7.0.1    20351    24966
gcc-7.1.1    20383    24982
gcc-8.0.0    20686    25065
It seems whatever happened in early versions of gcc-7 has since
improved, and it probably was a bug since older and newer versions
create similar code size (I have not looked at the actual object code).
The 5K difference in gcc-5 and higher still seems like a lot. It would
also be interesting to look at the decompression performance of
this code with the different compilers to see if it got better or worse.
Most likely, gcc got better at inlining and unrolling parts of the
algorithm, but sometimes an object file that doubles or triples in
size is an indication that the compiler did something really bad.
Arnd