lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 11 May 2020 17:04:56 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     "Jason A. Donenfeld" <Jason@...c4.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        stable <stable@...r.kernel.org>, "H.J. Lu" <hjl.tools@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jakub Jelinek <jakub@...hat.com>,
        Oleksandr Natalenko <oleksandr@...hat.com>,
        Arnd Bergmann <arnd@...db.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Laight <David.Laight@...lab.com>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>
Subject: Re: [PATCH v2] Kconfig: default to CC_OPTIMIZE_FOR_PERFORMANCE_O3 for
 gcc >= 10

On Mon, May 11, 2020 at 2:57 PM Jason A. Donenfeld <Jason@...c4.com> wrote:
>
> GCC 10 appears to have changed -O2 in order to make compilation time
> faster when using -flto, seemingly at the expense of performance, in
> particular with regards to how the inliner works. Since -O3 these days
> shouldn't have the same set of bugs as 10 years ago, this commit
> defaults new kernel compiles to -O3 when using gcc >= 10.

I'm not convinced this is sensible.

-O3 historically does bad things with gcc. Including bad things for
performance. It traditionally makes code larger and often SLOWER.

And I don't mean slower to compile (although that's an issue). I mean
actually generating slower code.

Things like trying to unroll loops etc makes very little sense in the
kernel, where we very seldom have high loop counts for pretty much
anything.

There's a reason -O3 isn't even offered as an option.

Maybe things have changed, and maybe they've improved. But I'd like to
see actual numbers for something like this.

Not inlining as aggressively is not necessarily a bad thing. It can
be, of course. But I've actually also done gcc bugreports about gcc
inlining too much, and generating _worse_ code as a result (ie
inlinging things that were behind an "if (unlikely())" test, and
causing the likely path to grow a stack fram and stack spills as a
result).

So just "O3 inlines more" is not a valid argument.

              Linus

Powered by blists - more mailing lists