Message-ID: <50431bff2cda445490f5242c1189c8cd@AcuMS.aculab.com>
Date:   Sat, 10 Feb 2018 15:26:51 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Denys Vlasenko' <dvlasenk@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Joerg Roedel <joro@...tes.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        the arch/x86 maintainers <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>, Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Juergen Gross <jgross@...e.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>, Jiri Kosina <jkosina@...e.cz>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Brian Gerst <brgerst@...il.com>,
        "Eduardo Valentin" <eduval@...zon.com>,
        Greg KH <gregkh@...uxfoundation.org>,
        "Will Deacon" <will.deacon@....com>,
        "Liguori, Anthony" <aliguori@...zon.com>,
        Daniel Gruss <daniel.gruss@...k.tugraz.at>,
        Hugh Dickins <hughd@...gle.com>,
        Kees Cook <keescook@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Waiman Long <llong@...hat.com>, Pavel Machek <pavel@....cz>,
        Joerg Roedel <jroedel@...e.de>
Subject: RE: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

From: Denys Vlasenko
> Sent: 09 February 2018 17:17
> On 02/09/2018 06:05 PM, Linus Torvalds wrote:
> > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel <joro@...tes.org> wrote:
> >> +
> >> +       /* Copy over the stack-frame */
> >> +       cld
> >> +       rep movsb
> >
> > Ugh. This is going to be horrendous. Maybe not noticeable on modern
> > CPU's, but the whole 32-bit code is kind of pointless on a modern CPU.
> >
> > At least use "rep movsl". If the kernel stack isn't 4-byte aligned,
> > you have issues.

The alignment doesn't matter, 'rep movsl' will still work.

> Indeed, "rep movs" has some setup overhead that makes it undesirable
> for small sizes. In my testing, moving less than 128 bytes with "rep movs"
> is a loss.

It very much depends on the cpu.

Recent (Haswell?) Intel cpus have hardware support for optimising 'rep movsb'
for cached memory locations so that it is fast regardless of the alignments.
The setup cost is fairly small.

The previous generation had an optimisation for 'rep movsb' for fewer than
7 bytes, but for larger counts the setup cost was significantly higher.
On those CPUs you needed to use 'rep movsd' (or 'rep movsq', since 64 bits
is best) for the bulk of a copy.

Actually, instead of using 'rep movsb' to copy the odd few bytes, for
memcpy() you can copy the last (misaligned) 8 bytes first then use
'rep movsd' for the bulk of the copy.
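
A minimal C sketch of that tail-first idea, for illustration only. The
function name copy_tail_first is hypothetical, the buffers are assumed not
to overlap, the length is assumed to be at least 8 bytes, and the plain
word loop merely stands in for where the 'rep movs' bulk copy would go:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from any kernel or libc source).
 * Assumes n >= 8 and that dst and src do not overlap.
 */
static void copy_tail_first(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uint64_t w;

    /* Copy the last 8 bytes first; this access may be misaligned and
     * may overlap the final word copied by the bulk loop below. */
    memcpy(&w, s + n - 8, 8);
    memcpy(d + n - 8, &w, 8);

    /* Bulk copy in whole 8-byte words. This loop is a stand-in for
     * the 'rep movs' instruction; it covers bytes [0, 8 * (n / 8)). */
    for (size_t i = 0; i + 8 <= n; i += 8) {
        memcpy(&w, s + i, 8);
        memcpy(d + i, &w, 8);
    }
}
```

The bulk loop leaves at most 7 trailing bytes uncopied, and the initial
8-byte tail copy always covers them, so no byte-sized 'rep movsb' is needed.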

On Netburst P4 the setup cost for any 'rep movs' was something like 45 clocks.
You really didn't want to use them for short copies.
(A C compiler from a well known OS supplier will 'optimise' any copy loop
into 'rep movsb' - not entirely the best of optimisations!)

I also managed to match the per-cycle cost of 'rep movsl' with a copy
loop on my Athlon-700 (but not the setup cost; on a P4 I might have
beaten the setup cost as well).

	David
