Message-ID: <8552376s-o19r-3775-6917-p8oq181oosq6@syhkavp.arg>
Date:   Tue, 9 Mar 2021 11:49:16 -0500 (EST)
From:   Nicolas Pitre <nico@...xnic.net>
To:     Masahiro Yamada <masahiroy@...nel.org>
cc:     Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
        Christoph Hellwig <hch@....de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jessica Yu <jeyu@...nel.org>,
        Sami Tolvanen <samitolvanen@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [PATCH 0/4] kbuild: build speed improvment of
 CONFIG_TRIM_UNUSED_KSYMS

On Tue, 9 Mar 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre <nico@...xnic.net> wrote:
> >
> > If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> > That comes with the feature.
> 
> This patch set intends to change this.
> TRIM_UNUSED_KSYMS will build without additional cost,
> like LD_DEAD_CODE_DATA_ELIMINATION.

OK... I do see how you're going about it.

> > > Modules are relocatable ELF.
> > > Clang LTO cannot eliminate any code.
> > > GCC LTO does not work with relocatable ELF
> > > in the first place.
> >
> > I don't think I follow you here. What relocatable ELF has to do with LTO?
> 
> What is important is,
> GCC LTO is the feature of gcc, not binutils.
> That is, LD_FINAL is $(CC).

Exactly.

> GCC LTO can be implemented for the final link stage
> by using $(CC) as the linker driver.
> Then, it can determine which code is unreachable.
> In other words, GCC LTO works only when building
> the final executable.

Yes. And it does so by filling .o files with its intermediate code 
representation and not ELF code.

> On the other hand, a relocatable ELF is created
> by $(LD) -r by combining some objects together.
> The relocatable ELF can be fed to another $(LD) -r,
> or the final link stage.

You can still create relocatable ELF using LTO. But LTO stops there. 
From that point on, .o files will no longer contain data that LTO can 
use if you further combine those object files together. But until that 
point, LTO is still usable.

> As I said above, modules are created by $(LD) -r.
> It is not possible to implement GCC LTO for modules.

If I remember correctly (that was a while ago), the problem with LTO and 
the kernel had to do with the fact that every subdirectory was gathering 
object files in built-in.o using ld -r. At some point we switched to 
gathering object files into built-in.a files where no linking is taking 
place. The real linking happens in vmlinux.o where LTO may now do its 
magic.

The same is true for modules. Compiling foo_module.c into foo_module.o 
will create a .o file with LTO data rather than executable code. But 
when you create the final .o for the module, LTO takes place and 
produces the relocatable ELF executable.

> > I've successfully used gcc LTO on the kernel quite a while ago.
> >
> > For a reference about binary size reduction with LTO and
> > CONFIG_TRIM_UNUSED_KSYMS please read this article:
> >
> > https://lwn.net/Articles/746780/
> 
> Thanks for the great articles.
> 
> Just for curiosity, I think you used GCC LTO from
> Andy's GitHub.

Right. I provided the reference in the preceding article:
https://lwn.net/Articles/744507/ 

> In the article, you took stm32_defconfig as an example,
> but ARM does not select ARCH_SUPPORTS_LTO.
> 
> Did you add some local hacks to make LTO work
> for ARM?

Of course. This article was written in 2017 and no LTO support at all 
was in mainline back then. But, besides adding CONFIG_LTO, very little 
was needed to make it compile, and I did upstream most changes such as 
commit 75fea300d7, commit a85b2257a5, commit 5d48417592, commit 
19c233b79d, etc.

> I tried the lto-5.8.1 branch, but
> I did not even succeed in building x86 + LTO.

My latest working LTO branch (i.e. last time I worked on it) is much 
older than that.

Maybe people aren't very excited about LTO because it makes recompiling 
the kernel take many times longer: gcc does its optimization passes on 
the whole kernel even if you modify a single file.


Nicolas
