Date:	Sat, 13 Jun 2015 08:30:36 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	"H. Peter Anvin" <hpa@...or.com>
Cc:	Denys Vlasenko <dvlasenk@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Borislav Petkov <bp@...en8.de>,
	Andy Lutomirski <luto@...capital.net>,
	Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Will Drewry <wad@...omium.org>,
	Kees Cook <keescook@...omium.org>,
	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/asm/entry/32: Slightly better handling of syscall
 errors in auditing


* H. Peter Anvin <hpa@...or.com> wrote:

> I think you misunderstand partial register stalls.  They happen (on some 
> microarchitectures) when you write part of a register and then use the whole 
> register.

Yes, there's no partial register stall in this or later code handling these 
values.

> > "setbe %al" insn has a register merge stall: it needs to combine previous %eax 
> > value with new value for the lowest byte. Subsequent "movzbl %al,%edi" in turn 
> > depends on its completion.
> > 
> > This patch replaces "setbe %al + movzbl %al,%edi" pair of insns with "xor 
> > %edi,%edi" before the comparison, and conditional "inc %edi".

So here's the code in wider context:

>    cmpl      $-MAX_ERRNO, %eax     /* is it an error ? */
>    jbe       1f
>    movslq    %eax, %rsi            /* if error sign extend to 64 bits */
> 1: setbe     %al                   /* 1 if error, 0 if not */
>    movzbl    %al, %edi             /* zero-extend that into %edi */

What happens here is that at the point the SETBE executes it needs to know the 
previous 32-bit value of EAX. But the previous JBE needs to know it already (it 
needs the CF and ZF result of the CMPL comparison), so there's no real additional 
dependency.

(The MOVSLQ of EAX will likewise already have the full value of EAX, because the 
JBE already needs it.)

Furthermore, the following SETBE sets an entirely new value for the 8-bit AL. The 
'entirely new value' will be handled by modern uarchs with register renaming (and 
marking that it's a rename for the low byte of EAX), giving the new value a 
separate, independent path to compute and use - and that renamed register value 
will be moved into EDI (zero-extended).

The CPU might eventually have to merge the previous value of EAX with the new 
value for AL, but nothing in this piece of code depends on the merged value. If 
something did read the full EAX, then _that_ would create a partial register stall.
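
To make the data flow concrete, here is a rough C model of the two instructions. C cannot show renaming or stalls, so this is only a sketch of which bits each instruction reads and writes; the helper names are made up for illustration and are not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: model a write to AL (e.g. by SETBE). Architecturally
 * the upper 24 bits of EAX are kept, so the new EAX is a merge of the old
 * value and the new low byte. */
static uint32_t write_al(uint32_t old_eax, uint8_t new_al)
{
    return (old_eax & 0xffffff00u) | new_al;
}

/* Hypothetical helper: model "movzbl %al,%edi", which reads only AL. */
static uint32_t movzbl_al(uint32_t eax)
{
    return eax & 0xffu;
}

/* EDI comes out the same whatever the old upper bits of EAX were. */
static void check_edi_ignores_upper_bits(void)
{
    assert(movzbl_al(write_al(0x12345678u, 1)) == 1);
    assert(movzbl_al(write_al(0xffffffffu, 1)) == 1);
    assert(movzbl_al(write_al(0x00000000u, 0)) == 0);
}
```

Only a later read of the full merged value (the return value of write_al() in this model) would need both the old EAX and the new AL at once.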

And as it happens, there's no such subsequent dependency, because we call a C 
function right away:

       call    __audit_syscall_exit

and RAX is a freely available register used as the return code. It's being 
overwritten early in the __audit_syscall_exit() function's execution by an 
instruction that writes the whole of EAX (to 0 or -1, depending on CF):

    28d4:       19 c0                   sbb    %eax,%eax

which will fully overwrite the previous partial value without extra dependencies.

So the real motivation of the patch is to simplify the setting of EDI to 0 or 1 by 
using a branch we already execute.
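
For reference, a small C sketch of the two ways of producing the 0/1 value in EDI. It models only the resulting values, not the exact branch layout of the patch (which is not shown in this mail). MAX_ERRNO is assumed to be 4095 as in include/linux/err.h, and the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: MAX_ERRNO is 4095, as in the kernel's include/linux/err.h. */
#define MAX_ERRNO 4095u

/* The unsigned "below or equal" condition that JBE and SETBE both test
 * after "cmpl $-MAX_ERRNO, %eax". */
static int below_or_equal(uint32_t eax)
{
    return eax <= (uint32_t)-MAX_ERRNO;
}

/* Current code: "setbe %al" followed by "movzbl %al,%edi". */
static uint32_t edi_via_setbe(uint32_t eax)
{
    uint8_t al = (uint8_t)below_or_equal(eax);  /* setbe %al */
    return (uint32_t)al;                        /* movzbl %al,%edi */
}

/* Patched idea: "xor %edi,%edi" up front, then a conditional "inc %edi". */
static uint32_t edi_via_xor_inc(uint32_t eax)
{
    uint32_t edi = 0;            /* xor %edi,%edi */
    if (below_or_equal(eax))     /* the branch that is executed anyway */
        edi++;                   /* inc %edi */
    return edi;
}

/* Both forms agree on a few sample values around the boundary. */
static void check_forms_agree(void)
{
    const uint32_t samples[] = { 0u, 1u, 0xfffff000u, 0xfffff001u,
                                 0xfffff002u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(edi_via_setbe(samples[i]) == edi_via_xor_inc(samples[i]));
}
```

The second form never touches AL, so the question of merging a partial EAX does not come up at all.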

Thanks,

	Ingo