Date:	Mon, 15 Jun 2015 22:20:08 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Denys Vlasenko <dvlasenk@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Borislav Petkov <bp@...en8.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andy Lutomirski <luto@...capital.net>,
	Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Will Drewry <wad@...omium.org>,
	Kees Cook <keescook@...omium.org>, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/5] x86/asm/entry/32: Replace RESTORE_RSI_RDI[_RDX] with
 open-coded 32-bit reads


* Denys Vlasenko <dvlasenk@...hat.com> wrote:

> On 06/14/2015 10:40 AM, Ingo Molnar wrote:
> > 
> > * Denys Vlasenko <dvlasenk@...hat.com> wrote:
> > 
> >> 	+8b 74 24 68            mov    0x68(%rsp),%esi
> >> 	+8b 7c 24 70            mov    0x70(%rsp),%edi
> >> 	+8b 54 24 60            mov    0x60(%rsp),%edx
> > 
> > Btw., could you (in another patch) order the restoration properly, by pt_regs 
> > memory order, where possible?
> 
> Will do.
> 
> > So this:
> > 
> >> +	movl	RSI(%rsp), %esi
> >> +	movl	RDI(%rsp), %edi
> >> +	movl	RDX(%rsp), %edx
> >>  	movl	RIP(%rsp), %ecx
> >>  	movl	EFLAGS(%rsp), %r11d
> > 
> > would become:
> > 
> > 	movl	RDX(%rsp), %edx
> > 	movl	RSI(%rsp), %esi
> > 	movl	RDI(%rsp), %edi
> > 	movl	RIP(%rsp), %ecx
> >  	movl	EFLAGS(%rsp), %r11d
> > 
> > ... or so.
> 
> Actually, ecx and r11 need to be loaded first. They are not so much "restored" 
> as "prepared for SYSRET insn". Every cycle lost in loading these delays SYSRET. 
> [...]
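
For reference, "pt_regs memory order" here means ascending stack offset. The
disassembly quoted above reads %edx from 0x60, %esi from 0x68 and %edi from
0x70. A minimal C sketch of that layout follows. It assumes the field order
mirrors the x86-64 struct pt_regs of that era; the name pt_regs_sketch is made
up for illustration, and only the dx/si/di offsets are confirmed by the
disassembly in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative mirror of the x86-64 pt_regs field order.
 * ASSUMPTION: this is a hand-written sketch, not the kernel header.
 */
struct pt_regs_sketch {
	uint64_t r15, r14, r13, r12, bp, bx;	/* callee-preserved regs */
	uint64_t r11, r10, r9, r8;		/* callee-clobbered regs */
	uint64_t ax, cx, dx, si, di;
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;		/* hardware return frame */
};

/* The three offsets seen in the quoted disassembly. */
static_assert(offsetof(struct pt_regs_sketch, dx) == 0x60, "RDX offset");
static_assert(offsetof(struct pt_regs_sketch, si) == 0x68, "RSI offset");
static_assert(offsetof(struct pt_regs_sketch, di) == 0x70, "RDI offset");
```

With this layout, restoring in the order RDX, RSI, RDI walks the stack frame
upwards, and the RIP and EFLAGS slots sit above all three.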

So in the typical case they will still be cached, and so their max latency should 
be around 3 cycles.

In fact, because they are memory loads, they don't really have dependencies, so 
they should be available to SYSRET almost immediately, i.e. within a cycle - and 
there's no reason to believe that these loads wouldn't pipeline properly and 
parallelize with the many other things SYSRET has to do to organize a return to 
user-space, before it can actually use the target RIP and RFLAGS.

So I strongly doubt that the placement of the RCX and R11 load before the SYSRET 
matters to performance.

In any case this should be testable by looking at syscall performance and 
reordering the instructions.
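
One way such a measurement could look from user-space is a tight loop around a
cheap system call, run once per kernel variant. This is only a sketch under
stated assumptions: the helper names now_ns() and avg_syscall_ns() are made up
for illustration, getppid is an arbitrary choice of a trivial syscall, and
whether a given build of the program actually reaches the SYSRET path touched
by this patch depends on the CPU and on how the binary enters the kernel:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average wall-clock cost, in nanoseconds, of one getppid system call
 * over 'iters' back-to-back calls. syscall() is used directly so that
 * every iteration really enters the kernel.
 */
double avg_syscall_ns(unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	assert(iters > 0);	/* avoid dividing by zero below */

	start = now_ns();
	for (i = 0; i < iters; i++)
		syscall(SYS_getppid);
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

Calling avg_syscall_ns() with a few million iterations on a kernel built with
each instruction ordering, pinned to one CPU and repeated several times, would
give numbers to compare. A difference smaller than the run-to-run noise would
support the view that the placement of the loads does not matter.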

Thanks,

	Ingo