lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue,  5 May 2020 15:55:03 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
        clang-built-linux@...glegroups.com, arnd@...db.de
Cc:     "Jason A. Donenfeld" <Jason@...c4.com>,
        Kees Cook <keescook@...omium.org>,
        George Burgess <gbiv@...gle.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>
Subject: [PATCH] Kbuild: disable FORTIFY_SOURCE on clang-10

clang-10 has a broken optimization stage that doesn't enable the
compiler to prove at compile time that certain memcpys are within
bounds, and thus the outline memcpy is always called, resulting in
horrific performance, and in some cases, excessive stack frame growth.
Here's a simple reproducer:

    typedef unsigned long size_t;
    void *c(void *dest, const void *src, size_t n) __asm__("memcpy");
    extern inline __attribute__((gnu_inline)) void *memcpy(void *dest, const void *src, size_t n) { return c(dest, src, n); }
    void blah(char *a)
    {
      unsigned long long b[10], c[10];
      int i;

      memcpy(b, a, sizeof(b));
      for (i = 0; i < 10; ++i)
        c[i] = b[i] ^ b[9 - i];
      for (i = 0; i < 10; ++i)
        b[i] = c[i] ^ a[i];
      memcpy(a, b, sizeof(b));
    }

Compile this with clang-9 and clang-10 and observe:

zx2c4@...nkpad /tmp/curve25519-hacl64-stack-frame-size-test $ clang-10 -Wframe-larger-than=0 -O3 -c b.c -o c10.o
b.c:5:6: warning: stack frame size of 104 bytes in function 'blah' [-Wframe-larger-than=]
void blah(char *a)
     ^
1 warning generated.
zx2c4@...nkpad /tmp/curve25519-hacl64-stack-frame-size-test $ clang-9 -Wframe-larger-than=0 -O3 -c b.c -o c9.o

Looking at the disassembly of c10.o and c9.o, one can see that c9.o is
properly optimized in the obvious way one would expect, while c10.o has
blown up and includes extern calls to memcpy.

This is present on the most trivial bits of code. Thus, for clang-10, we
just set __NO_FORTIFY globally, so that this issue won't be incurred.

Cc: Arnd Bergmann <arnd@...db.de>
Cc: LKML <linux-kernel@...r.kernel.org>
Cc: clang-built-linux <clang-built-linux@...glegroups.com>
Cc: Kees Cook <keescook@...omium.org>
Cc: George Burgess <gbiv@...gle.com>
Cc: Nick Desaulniers <ndesaulniers@...gle.com>
Link: https://bugs.llvm.org/show_bug.cgi?id=45802
Signed-off-by: Jason A. Donenfeld <Jason@...c4.com>
---
 Makefile | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Makefile b/Makefile
index 49b2709ff44e..f022f077591d 100644
--- a/Makefile
+++ b/Makefile
@@ -768,6 +768,13 @@ KBUILD_CFLAGS += -Wno-gnu
 # source of a reference will be _MergedGlobals and not on of the whitelisted names.
 # See modpost pattern 2
 KBUILD_CFLAGS += -mno-global-merge
+
+# clang-10 has a broken optimization stage that causes memcpy to always be
+# outline, resulting in excessive stack frame growth and poor performance.
+ifeq ($(shell test $(CONFIG_CLANG_VERSION) -ge 100000 && test $(CONFIG_CLANG_VERSION) -lt 110000; echo $$?),0)
+KBUILD_CFLAGS += -D__NO_FORTIFY
+endif
+
 else
 
 # These warnings generated too much noise in a regular build.
-- 
2.26.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ