linux-kernel - Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

Message-ID: <20180105112509.GD253582@google.com>
Date:   Fri, 5 Jan 2018 03:25:09 -0800
From:   Paul Turner <pjt@...gle.com>
To:     David Woodhouse <dwmw2@...radead.org>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Kees Cook <keescook@...gle.com>,
        Rik van Riel <riel@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...capital.net>,
        Jiri Kosina <jikos@...nel.org>,
        One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>
Subject: Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

On Fri, Jan 05, 2018 at 10:55:38AM +0000, David Woodhouse wrote:
> On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote:
> > On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> > > > 
> > > > Pretty much.
> > > > Paul's writeup: https://support.google.com/faqs/answer/7625886
> > > > tldr: jmp *%r11 gets converted to:
> > > > call set_up_target;
> > > > capture_spec:
> > > >   pause;
> > > >   jmp capture_spec;
> > > > set_up_target:
> > > >   mov %r11, (%rsp);
> > > >   ret;
> > > > where capture_spec part will be looping speculatively.
> > > 
> > > That is almost identical to what's in my latest patch set, except that
> > > the capture_spec loop has 'lfence' instead of 'pause'.
> > 
> > When choosing this sequence I benchmarked several alternatives here, including
> > (nothing, nops, fences, and other serializing instructions such as cpuid).
> > 
> > The "pause; jmp" sequence proved minutely faster than "lfence;jmp" which is why
> > it was chosen.
> > 
> >   "pause; jmp" 33.231 cycles/call 9.517 ns/call
> >   "lfence; jmp" 33.354 cycles/call 9.552 ns/call
> > 
> > (Timings are for a complete retpolined indirect branch.)
> 
> Yeah, I studiously ignored you here and went with only what Intel had
> *assured* me was correct and put into the GCC patches, rather than
> chasing those 35 picoseconds ;)

It's also worth noting that while the difference is small in absolute terms,
it's likely due to reduced variation:

I would expect:
- pause to be extremely consistent in its timings
- pause and lfence to be close on their average timings, particularly in a
  micro-benchmark.

This suggests that the difference may be larger in the occasional cases where
you get "unlucky" and hit some other uarch interaction in the lfence path.
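
As a rough illustration of that reasoning, here is a back-of-the-envelope sketch
in Python. It assumes the entire gap between the two quoted means comes from a
small fraction of "unlucky" lfence calls. The fractions are hypothetical, picked
only for illustration (nothing in this thread measures them), and
implied_penalty is just a helper name for this sketch.

```python
# Mean timings quoted earlier in the thread, in cycles per call
# (complete retpolined indirect branch).
PAUSE_CYCLES = 33.231
LFENCE_CYCLES = 33.354


def implied_penalty(mean_gap_cycles, unlucky_fraction):
    """Extra cycles per unlucky call, ASSUMING the whole mean gap is
    contributed by that fraction of calls and the rest cost the same."""
    return mean_gap_cycles / unlucky_fraction


gap = LFENCE_CYCLES - PAUSE_CYCLES  # roughly 0.123 cycles/call

# Hypothetical fractions of unlucky calls; not measured anywhere.
for fraction in (1.0, 0.01, 0.001):
    penalty = implied_penalty(gap, fraction)
    print(f"unlucky fraction {fraction}: ~{penalty:.1f} extra cycles per unlucky call")
```

Under these assumptions, a gap of about 0.123 cycles in the mean corresponds to
roughly 12 extra cycles if 1 in 100 calls is affected, and roughly 123 extra
cycles if 1 in 1000 is. A near-identical average does not rule out a noticeably
worse tail.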
> 
> The GCC patch set already had about four different variants over time,
> with associated "oh shit, that one doesn't actually work; try this".
> What we have in my patch set is precisely what GCC emits at the moment.
> 
> I'm all for optimising it further, but maybe not this week.
> 
> Other than that, is there any other development from your side that I
> haven't captured in the latest (v4) series?
> http://git.infradead.org/users/dwmw2/linux-retpoline.git/