Message-ID: <4a047ea5-7717-d089-48bf-597434be7c4c@redhat.com>
Date: Fri, 9 Feb 2018 20:30:55 +0100
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Joerg Roedel <jroedel@...e.de>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
"H . Peter Anvin" <hpa@...or.com>,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...el.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>, Jiri Kosina <jkosina@...e.cz>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Brian Gerst <brgerst@...il.com>,
David Laight <David.Laight@...lab.com>,
Eduardo Valentin <eduval@...zon.com>,
Greg KH <gregkh@...uxfoundation.org>,
Will Deacon <will.deacon@....com>,
"Liguori, Anthony" <aliguori@...zon.com>,
Daniel Gruss <daniel.gruss@...k.tugraz.at>,
Hugh Dickins <hughd@...gle.com>,
Kees Cook <keescook@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Waiman Long <llong@...hat.com>, Pavel Machek <pavel@....cz>
Subject: Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack
On 02/09/2018 08:02 PM, Joerg Roedel wrote:
> On Fri, Feb 09, 2018 at 09:05:02AM -0800, Linus Torvalds wrote:
>> On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel <joro@...tes.org> wrote:
>>> +
>>> + /* Copy over the stack-frame */
>>> + cld
>>> + rep movsb
>>
>> Ugh. This is going to be horrendous. Maybe not noticeable on modern
>> CPUs, but the whole 32-bit code is kind of pointless on a modern CPU.
>>
>> At least use "rep movsl". If the kernel stack isn't 4-byte aligned,
>> you have issues.
>
> Okay, I used movsb because I remembered that being the recommendation
> for the most efficient memcpy, and it saves me an instruction. But that
> is probably only true on modern CPUs.
"rep movsb" is fast (it copies data with full-width loads and stores,
up to 64 bytes wide on the latest Intel CPUs), but that fast path kicks
in only for largish blocks. In your case, you are copying fewer than
100 bytes.
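
To make the difference concrete, here is a rough userspace sketch (not
the kernel entry code) of the two variants discussed above: a
byte-granular "rep movsb" copy and a dword-granular "rep movsl" copy of
a small frame. The function names and the 68-byte frame size are
hypothetical, chosen only for illustration; the inline asm assumes a
GCC-compatible compiler on x86, with a plain C fallback elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X86_STRING_OPS 1
#else
#define HAVE_X86_STRING_OPS 0
#endif

/* Byte-granular copy: the count register holds the length in bytes. */
void copy_frame_bytes(void *dst, const void *src, size_t len)
{
#if HAVE_X86_STRING_OPS
	/* Userspace ABIs guarantee DF=0 on function entry, so no cld here. */
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
#else
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
}

/* Dword-granular copy: len must be a multiple of 4. */
void copy_frame_dwords(void *dst, const void *src, size_t len)
{
	size_t count = len / 4;	/* number of 4-byte units to move */

	assert(len % 4 == 0);
#if HAVE_X86_STRING_OPS
	__asm__ volatile("rep movsl"
			 : "+D"(dst), "+S"(src), "+c"(count)
			 :
			 : "memory");
#else
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (count--) {
		uint32_t w;

		memcpy(&w, s, sizeof(w));
		memcpy(d, &w, sizeof(w));
		d += 4;
		s += 4;
	}
#endif
}

/* Returns 1 if both variants reproduce the source frame exactly. */
int frame_copy_selfcheck(void)
{
	unsigned char src[68], a[68], b[68];	/* hypothetical frame size */
	size_t i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 7 + 1);
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));

	copy_frame_bytes(a, src, sizeof(src));
	copy_frame_dwords(b, src, sizeof(src));

	return memcmp(a, src, sizeof(src)) == 0 &&
	       memcmp(b, src, sizeof(src)) == 0;
}
```

Both variants produce the same result; the only difference is the unit
of the count. For a frame this small the dword form needs a quarter of
the iterations when the CPU does not take the fast string path.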