linux-kernel - Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

Message-ID: <CAPM31RKp=tAz8TQ=tCGQRNHUKWvrC9B4LV3wG+hBUr+rG_FMsQ@mail.gmail.com>
Date:   Thu, 4 Jan 2018 01:24:41 -0800
From:   Paul Turner <pjt@...gle.com>
To:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...ux-foundation.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>, tglx@...uxtronix.de,
        Kees Cook <keescook@...gle.com>,
        Rik van Riel <riel@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...capital.net>,
        Jiri Kosina <jikos@...nel.org>, gnomes@...rguk.ukuu.org.uk
Subject: Re: [RFC] Retpoline: Binary mitigation for branch-target-injection
 (aka "Spectre")

On Thu, Jan 4, 2018 at 1:10 AM, Paul Turner <pjt@...gle.com> wrote:
> Apologies for the discombobulation around today's disclosure.  Obviously the
> original goal was to communicate this a little more coherently, but the
> unscheduled advances in the disclosure disrupted the efforts to pull this
> together more cleanly.
>
> I wanted to open discussion of the "retpoline" approach and define its
> requirements so that we can separate the core
> details from questions regarding any particular implementation thereof.
>
> As a starting point, a full write-up describing the approach is available at:
>   https://support.google.com/faqs/answer/7625886
>
> The 30 second version is:
> Returns are a special type of indirect branch.  As function returns are intended
> to pair with function calls, processors often implement dedicated return stack
> predictors.  Relying on this predictor allows us to generate an
> indirect branch in which speculative execution is intentionally redirected into
> a controlled location by a return stack target that we control.  This prevents
> branch target injections (also known as "Spectre") against these binaries.
>
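
To make the 30 second version concrete, here is a toy C model of the two
predictors involved.  It is a sketch under stated assumptions, not real
hardware behaviour and not the actual thunk: the addresses, the 16-entry
return stack depth, and all names (toy_cpu, retpoline_jump, ...) are
hypothetical.  It only shows where speculation is steered: a plain indirect
jump speculates to whatever the (attacker-trainable) indirect predictor holds,
while the call/overwrite/ret sequence speculates to a return stack entry that
we pushed ourselves.

```c
#include <assert.h>
#include <stddef.h>

/* All values below are assumptions made for this toy model. */
#define RSB_DEPTH            16u       /* assumed return stack depth */
#define ADDR_REAL_TARGET     0x1000UL  /* where the branch should really go */
#define ADDR_CAPTURE_LOOP    0x2000UL  /* harmless loop that traps speculation */
#define ADDR_ATTACKER_GADGET 0x6660UL  /* target an attacker trained into the predictor */

struct toy_cpu {
    unsigned long btb_target;       /* indirect branch predictor entry */
    unsigned long rsb[RSB_DEPTH];   /* return stack predictor */
    size_t rsb_count;
};

struct outcome {
    unsigned long speculative;      /* where the CPU speculates to */
    unsigned long architectural;    /* where execution really ends up */
};

/* A call pushes its return address onto the return stack predictor. */
static void toy_call(struct toy_cpu *cpu, unsigned long return_addr)
{
    assert(cpu->rsb_count < RSB_DEPTH);
    cpu->rsb[cpu->rsb_count++] = return_addr;
}

/* Plain indirect jump: the prediction comes from the trainable predictor. */
static struct outcome plain_indirect_jump(struct toy_cpu *cpu, unsigned long target)
{
    struct outcome o = { cpu->btb_target, target };
    return o;
}

/* Retpoline-style jump: call, overwrite the return address, then ret. */
static struct outcome retpoline_jump(struct toy_cpu *cpu, unsigned long target)
{
    /* The call records the capture loop as its return address, both on the
     * in-memory stack and in the return stack predictor. */
    unsigned long stack_slot = ADDR_CAPTURE_LOOP;
    struct outcome o;

    toy_call(cpu, ADDR_CAPTURE_LOOP);

    /* Overwrite only the in-memory return address with the real target. */
    stack_slot = target;

    /* The ret is predicted from the return stack, but retires to the
     * address that is actually on the stack. */
    o.speculative = cpu->rsb[--cpu->rsb_count];
    o.architectural = stack_slot;
    return o;
}
```

With btb_target set to ADDR_ATTACKER_GADGET, plain_indirect_jump() reports
the gadget as the speculative target, whereas retpoline_jump() reports
ADDR_CAPTURE_LOOP; both reach ADDR_REAL_TARGET architecturally.
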
> On the targets (Intel Xeon) we have measured so far, cost is within cycles of a
> "native" indirect branch for which branch prediction hardware has been disabled.
> This is unfortunately measurable -- from 3 cycles on average to about 30.
> However the cost is largely mitigated for many workloads since the kernel uses
> comparatively few indirect branches (versus say, a C++ binary).  With some
> effort we have the average overall overhead within the 0-1.5% range for our
> internal workloads, including some particularly high packet processing engines.
>
> There are several components, the majority of which are independent of kernel
> modifications:
>
> (1) A compiler supporting retpoline transformations.

An implementation for LLVM is available at:
  https://reviews.llvm.org/D41723

> (1a) Optionally: annotations for hand-coded indirect jmps, so that they may be
>     made compatible with (1).
>     [ Note: The only known indirect jmp which is not safe to convert, is the
>       early virtual address check in head entry. ]
> (2) Kernel modifications for preventing return-stack underflow (see document
>     above).
>    The key points where this occurs are:
>    - Context switches (into protected targets)
>    - interrupt return (we return into potentially unwinding execution)
>    - sleep state exit (flushes caches)
>    - guest exit.
>   (These can be run-time gated, a full refill costs 30-45 cycles.)
> (3) Optional: Optimizations so that direct branches can be used for hot kernel
>    indirects. While, as discussed above, kernel execution generally depends on
>    fewer indirect branches, there are a few places (in particular, the
>    networking stack) where we have chained sequences of indirects on hot paths.
> (4) More general support for guarding against RSB underflow in an affected
>     target.  While this is harder to exploit and may not be required for many
>     users, the approaches we have used here are not generally applicable.
>     Further discussion is required.
>
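
For (2) and (4), the underflow concern can be illustrated with another toy C
model.  This is a sketch only: the 16-entry depth, the assumption that an
empty return stack falls back to the indirect predictor, and every name here
(toy_rsb, rsb_refill, ...) are hypothetical and chosen for illustration.  A
real refill is a short sequence of calls, not a loop over an array.

```c
#include <assert.h>
#include <stddef.h>

/* All values below are assumptions made for this toy model. */
#define RSB_DEPTH            16u
#define ADDR_BENIGN_TRAP     0x2000UL  /* harmless speculation target */
#define ADDR_ATTACKER_GADGET 0x6660UL  /* attacker-trained fallback target */

struct toy_rsb {
    unsigned long entry[RSB_DEPTH];
    size_t count;                   /* number of valid entries */
    unsigned long fallback_target;  /* prediction used on underflow (assumed) */
};

/* Push a return address; when full, the oldest entry is discarded. */
static void rsb_push(struct toy_rsb *r, unsigned long ret_addr)
{
    if (r->count == RSB_DEPTH) {
        for (size_t i = 1; i < RSB_DEPTH; i++)
            r->entry[i - 1] = r->entry[i];
        r->count--;
    }
    r->entry[r->count++] = ret_addr;
}

/* Predict the target of a ret.  An empty return stack is the underflow
 * case: the model falls back to the attacker-trainable predictor. */
static unsigned long rsb_predict_ret(struct toy_rsb *r)
{
    if (r->count == 0)
        return r->fallback_target;
    return r->entry[--r->count];
}

/* Refill the whole return stack with benign entries, as one would at a
 * context switch, interrupt return, sleep state exit or guest exit. */
static void rsb_refill(struct toy_rsb *r)
{
    for (size_t i = 0; i < RSB_DEPTH; i++)
        rsb_push(r, ADDR_BENIGN_TRAP);
}
```

In this model a ret on an empty return stack predicts the attacker's target,
while after rsb_refill() the next RSB_DEPTH returns all predict the benign
trap.  Returns beyond that depth underflow again, which is the more general
case that (4) refers to.
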
> With respect to what these deltas mean for an unmodified kernel:

Sorry, this should have read: a kernel that does not care about this protection.

It has been a long day :-).

>  (1a) At minimum annotation only.  More complicated, config and
> run-time gated options are also possible.
>  (2) Trivially run-time & config gated.
>  (3) The de-virtualizing of these branches improves performance in both the
>      retpoline and non-retpoline cases.
>
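
The de-virtualization in (3) can be sketched in plain C as follows.  This is
an illustration only, not code from any patch set: the handler names
(ipv4_rx, ipv6_rx, other_rx) and dispatch_rx are hypothetical.  The idea is to
compare the function pointer against the targets known to be hot and call
those directly, so that only the uncommon case pays for an indirect branch
(and hence for a retpoline, in a retpoline build).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type and handlers, for illustration only. */
typedef int (*rx_handler_t)(int len);

static int ipv4_rx(int len)  { return len + 4; }
static int ipv6_rx(int len)  { return len + 6; }
static int other_rx(int len) { return len; }

/* Dispatch to a handler, using direct calls for the known hot targets. */
static int dispatch_rx(rx_handler_t handler, int len)
{
    assert(handler != NULL);

    if (handler == ipv4_rx)
        return ipv4_rx(len);    /* direct call: no indirect branch */
    if (handler == ipv6_rx)
        return ipv6_rx(len);    /* direct call: no indirect branch */

    return handler(len);        /* fallback: still an indirect call */
}
```

The direct calls can also be inlined by the compiler, which is one reason this
kind of change can help with or without retpolines.
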
> For an out of the box kernel that is reasonably protected, (1)-(3) are required.
>
> I apologize that this does not come with a clean set of patches, merging the
> things that we and Intel have looked at here.  That was one of the original
> goals for this week.  Strictly speaking, I think that Andi, David, and I have
> a fair amount of merging and clean-up to do here.  This is an attempt
> to keep discussion of the fundamentals at least independent of that.
>
> I'm trying to keep the above reasonably compact/dense.  I'm happy to expand on
> any details in sub-threads.  I'll also link back some of the other compiler work
> which is landing for (1).
>
> Thanks,
>
> - Paul