Message-ID: <BDEF22F0-84A5-4A36-B153-13E6A4260922@zytor.com>
Date:   Thu, 24 May 2018 11:59:36 -0700
From:   hpa@...or.com
To:     Nick Desaulniers <ndesaulniers@...gle.com>
CC:     Alistair Strachan <astrachan@...gle.com>,
        Manoj Gupta <manojgupta@...gle.com>,
        Matthias Kaehlcke <mka@...gle.com>,
        Greg Hackmann <ghackmann@...gle.com>, sedat.dilek@...il.com,
        tstellar@...hat.com, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [clang] stack protector and f1f029c7bf

On May 23, 2018 3:08:19 PM PDT, Nick Desaulniers <ndesaulniers@...gle.com> wrote:
>H. Peter,
>
>It was reported [0] that compiling the Linux kernel with Clang +
>CC_STACKPROTECTOR_STRONG was causing a crash in native_save_fl(), due to
>how GCC does not emit a stack guard for static inline functions (see
>Alistair's excellent report in [1]) but Clang does.
>
>When working with the LLVM release maintainers, Tom had suggested [2]
>changing the inline assembly constraint in native_save_fl() from '=rm' to
>'=r', and Alistair had verified the disassembly:
>
>(good) code generated w/o -fstack-protector-strong:
>
>native_save_fl:
>          pushfq
>          popq    -8(%rsp)
>          movq    -8(%rsp), %rax
>          retq
>
>(good) code generated w/ =r output constraint:
>
>native_save_fl:
>          pushfq
>          popq    %rax
>          retq
>
>(bad) code generated with -fstack-protector-strong:
>
>native_save_fl:
>          subq    $24, %rsp
>          movq    %fs:40, %rax
>          movq    %rax, 16(%rsp)
>          pushfq
>          popq    8(%rsp)
>          movq    8(%rsp), %rax
>          movq    %fs:40, %rcx
>          cmpq    16(%rsp), %rcx
>          jne     .LBB0_2
>          addq    $24, %rsp
>          retq
>.LBB0_2:
>          callq   __stack_chk_fail
>
>It looks like the suggestion is actually a revert of your commit:
>ab94fcf528d127fcb490175512a8910f37e5b346:
>x86: allow "=rm" in native_save_fl()
>
>It seemed like there was a question internally about why worry about pop
>adjusting the stack if the stack could be avoided altogether.
>
>I think Sedat can retest this, but I was curious as well about the commit
>message in ab94fcf528d: "[ Impact: performance ]", but Alistair's analysis
>of the disassembly seems to indicate there is no performance impact (in
>fact, looks better as there's one less mov).
>
>Is there a reason we should not revert ab94fcf528d12, or maybe a better
>approach?
>
>[0] https://lkml.org/lkml/2018/5/7/534
>[1] https://bugs.llvm.org/show_bug.cgi?id=37512#c15
>[2] https://bugs.llvm.org/show_bug.cgi?id=37512#c22

Ok, this is the *second* thing about LLVM-originated bug reports that drives me batty. When you *do* identify a real problem, you propose a paper-over and/or talk about it as an LLVM issue, and don't consider the often far bigger picture.

Issue 1: Fundamentally, the compiler is doing The Wrong Thing if it generates worse code with a less constrained =rm than with =r. That is a compiler optimization bug, period. The whole point with the less constrained option is to give the compiler the freedom of action.

You are claiming it doesn't buy us anything, but you are only looking at the paravirt case which is kind of "special" (in the short bus kind of way), and only because the compiler in question makes an incredibly stupid decision.

Issue 2: What you are flagging seems to be a far more fundamental problem, which would affect *any* use of push/pop in inline assembly. If that is true, we need to pull in the gcc people too and create an interface to let the compiler know that inline assembly needs a certain number of stack slots. We do a lot of push/pop in assembly. The other option is to explicitly turn the stack canary off for all such functions.

Issue 3: Let's face it, reading and writing the flags should be builtins, exactly because it has to do stack operations, which really means the compiler should be involved.
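
For reference, GCC and Clang do provide x86-64 builtins for reading and writing the flags register. A minimal sketch, assuming x86-64 and one of those compilers; the wrapper names read_flags_builtin() and write_flags_builtin() are hypothetical:

```c
/* Sketch; assumes x86-64 and a GCC- or Clang-compatible compiler. */

/* Read RFLAGS; the compiler emits and accounts for the stack traffic. */
static inline unsigned long long read_flags_builtin(void)
{
    return __builtin_ia32_readeflags_u64();
}

/* Write RFLAGS; privileged bits are not changed from user space. */
static inline void write_flags_builtin(unsigned long long flags)
{
    __builtin_ia32_writeeflags_u64(flags);
}
```

Because the compiler generates the push/pop itself, it knows about the stack adjustment, unlike the inline assembly case where the stack use is invisible to it.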

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
