Message-ID: <CA+55aFw_Dj7Lx_bE0B3mDxMUNz_K1yc=saziC9dmA_dKSkFWiw@mail.gmail.com>
Date: Tue, 6 May 2014 14:00:13 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Gleb Natapov <gleb@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [ABOMINATION] x86: Fast interrupt return to userspace
On Tue, May 6, 2014 at 1:35 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Heh. That is pretty disgusting. But I guess it could be interesting
> for timing. BRB.
Ooh. That's friggin impressive.
Guys, see if you can recreate these numbers. This is my totally
disgusting test-case, which really is just stress-testing page faults
and nothing else.
Silly C file attached, see the comment at the top of it. Then just do
"time ./a.out". It's designed to map the zero-page and access it. The
"start" thing was to make sure it's not hugepage-aligned, but that's
not actually enough with a big 1GB area, so you do need that whole
"echo never" thing: otherwise there will be tons of hugepage-aligned
areas inside it that the kernel will turn into no-ops for this test.
Anyway, on my Haswell with normal "iret", that program takes 8.4+-0.1 seconds.
With the disgusting sysret hackery, it takes 6.5+-0.1 seconds. That's
a rather impressive 23% performance improvement for page faulting.
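The 23% here is the reduction in wall-clock time relative to the iret run. A small sketch of that arithmetic, with a made-up helper name:

```c
/*
 * Hypothetical helper: relative reduction in runtime, in percent,
 * going from `before_s` seconds to `after_s` seconds.
 */
static double runtime_reduction_percent(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}
```

With the numbers above, runtime_reduction_percent(8.4, 6.5) is 100 * 1.9 / 8.4, which is about 22.6 and rounds to 23.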
I'll do profiles and test the kernel compile too, but the raw timings
are certainly promising. The "sysret" hack is pretty disgusting, and
it's broken too. sysret doesn't do some things iret does (like TF flag
etc), so it's not complete, but it's clearly good enough to run tests
on. It will definitely break ptrace() and friends.
Linus
View attachment "t.c" of type "text/x-csrc" (669 bytes)