lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 29 Aug 2019 11:15:04 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nick Desaulniers <ndesaulniers@...gle.com>
Cc:     Rasmus Villemoes <linux@...musvillemoes.dk>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>, Nadav Amit <namit@...are.com>,
        Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        clang-built-linux <clang-built-linux@...glegroups.com>,
        Joe Perches <joe@...ches.com>
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On Thu, Aug 29, 2019 at 10:36 AM Nick Desaulniers
<ndesaulniers@...gle.com> wrote:
>
> I'm curious what "the size of the asm" means, and how it differs
> precisely from "how many instructions GCC thinks it is."  I would
> think those are one and the same?  Or maybe "the size of the asm"
> means the size in bytes when assembled to machine code, as opposed to
> the count of assembly instructions?

The problem is that we do different sections in the inline asm, and
the instruction counts are completely bogus as a result.

The actual instruction in the code stream may be just a single
instruction. But the out-of-line sections can be multiple instructions
and/or a data section that contains exception information.

So we want the asm inlined, because the _inline_ part (and the hot
instruction) is small, even though the asm technically maybe generates
many more bytes of additional data.

The worst offenders for this tend to be

 - various exception tables for user accesses etc

 - "alternatives" where we list two or more different asm alternatives
and then pick the right one at boot time depending on CPU ID flags

 - "BUG_ON()" instructions where there's a "ud2" instruction and
various data annotations going with it

so gcc may be "technically correct" that the inline asm statement
contains ten instructions or more, but the actual instruction _code_
footprint in the asm is likely just a single instruction or two.

The statement counting is also completely off by the fact that some of
the "statements" are assembler directives (ie the
".pushsection"/".popsection" lines etc). So some of it is that the
instruction counting is off, but the largest part is that it's just
not relevant to the code footprint in that function.

Un-inlining a function because it contains a single inline asm
instruction is not productive. Yes, it might result in a smaller
binary over-all (because all those other non-code sections do take up
some space), but it actually results in a bigger code footprint.

              Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ