lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 13 May 2020 19:20:26 -0700
From:   Linus Torvalds <>
To:     Nick Desaulniers <>
Cc:     Borislav Petkov <>, Arnd Bergmann <>,
        Arvind Sankar <>,
        Kalle Valo <>,
        linux-wireless <>,
        "" <>,
        "the arch/x86 maintainers" <>,
        Kees Cook <>,
        Thomas Gleixner <>
Subject: Re: gcc-10: kernel stack is corrupted and fails to boot

On Wed, May 13, 2020 at 5:51 PM Nick Desaulniers
<> wrote:
> Are you sure LTO treats empty asm statements differently than full
> memory barriers in regards to preventing tail calls?

It had better.

At link-time, there is nothing left of an empty asm statement. So by
the time the linker runs, it only sees

    call xyz

in the object code. At that point, it's somewhat reasonable for any
link-time optimizer (or an optimizing assembler, for that matter) to
say "I'll just turn that sequence into a simple 'jmp xyz' instead".

Now, don't get me wrong - I'm not convinced any existing LTO does
that. But I'd also not be shocked by something like that.

In contrast, if it's a real mb(), the linker won't see just a
'call+ret" sequence. It will see something like

    call xyz

(ok, the mfence may actually be something else, and we'll have a label
on it and an alternatives table pointing to it, but the point is,
unlike an empty asm, there's something _there_).

Now, if the linker turns that 'call' into a 'jmp', the linker is just
plain buggy.

See the difference?

> The TL;DR of the very long thread is that
> is a proper fix, on
> the GCC side.  Adding arbitrary empty asm statements to work around
> it? Hacks.  Full memory barriers? Hacks.


A compiler person might call it a "hack". But said compiler person
will be _wrong_.

An intelligent developer knows that it will take years for compilers
to give us all the infrastructure we need, and even then the compiler
won't actually give us everything - and people will be using the old
compilers for years anyway.

That's why inline asm's exist. They are the escape from the excessive
confines of "let's wait for the compiler person to solve this for us"
- which they'll never do completely anyway.

It's a bit like unsafe C type casts and allowing people to write
"non-portable code". Some compiler people will say that it's bad, and
unsafe. Sure, it can be unsafe, but the point is that it allows you to
do things that aren't necessarily _possible_ to do in an overly
restrictive language.

Sometimes you need to break the rules.

There's a reason everybody writes library routines in "unsafe"
languages like C. Because you need those kinds of escapes in order to
actually do something like a memory allocator etc.

And that's also why we have inline asm - because the compiler will
never know everything or be able to generate code for everything
people wants to do.

And anybody that _thinks_ that the compiler will always know better
and should be in complete control shouldn't be working on compilers.
They should probably be repeating kindergarten, sitting in a corner
eating paste with their friends.

So no. Using inline asms to work around the compiler not understanding
what is going on isn't a "hack". It's the _point_ of inline asm.

Is it perfect? No. But it's not like there are many alternatives.


Powered by blists - more mailing lists