Message-ID: <4972738A.6030506@zytor.com>
Date:	Sat, 17 Jan 2009 16:10:50 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Kyle McMartin <kyle@...radead.org>, Ingo Molnar <mingo@...e.hu>,
	Mikael Pettersson <mikpe@...uu.se>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: "eliminate warn_on_slowpath()" change causes many gcc-3.2.3 warnings

Linus Torvalds wrote:
> 
> On Sat, 17 Jan 2009, H. Peter Anvin wrote:
>> At least on x86, the two ops should be the same cost?
> 
> Not with the code Kyle had, which forces a memory load.
> 
> But yes, with a constant address, it at least comes close. But with a 
> small explicit constant value, the compiler can often do even better. For 
> example, you can generate a 64-bit -1 in many ways, while a 64-bit random 
> address is much more work to generate.
> 
> Of course, I don't know how much gcc takes advantage of this. Maybe it 
> always just generates a silly "movq" rather than being smarter about it 
> (eg "orl $-1,reg" can do it in four bytes, I think, because you can use a 
> single-byte constant).
> 
> Of course, zero is even easier to generate, so NULL is the best constant 
> of all, but generally small integers are more amenable to optimization 
> than generic addresses. They're also generally easier to test for.
> 

When compiling with -O2 -mcmodel=kernel on gcc 4.3.2, you end up with
the same 7-byte encoding for both the address and the -1 constant:

   4:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
                        7: R_X86_64_32S bluttan
  10:   48 c7 c7 ff ff ff ff    mov    $0xffffffffffffffff,%rdi

With -Os -mcmodel=kernel, it's a bit better (4 bytes for the constant):

   4:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
                        7: R_X86_64_32S bluttan
  10:   48 83 cf ff             or     $0xffffffffffffffff,%rdi
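
Both encodings rely on sign extension of a short immediate: 48 c7 /0 is
"mov imm32 to r/m64" with the 32-bit immediate sign-extended to 64 bits
(which is also why the address needs the signed R_X86_64_32S relocation),
and 48 83 /1 is "or imm8 to r/m64" with the 8-bit immediate sign-extended.
A minimal C model of the two instructions, as a sketch only; the function
names mov_imm32 and or_imm8 are made up for illustration:

```c
#include <stdint.h>

/* Models "mov $imm32,%r64" (REX.W C7 /0 id): the 32-bit immediate is
 * sign-extended to 64 bits and replaces the register contents. */
uint64_t mov_imm32(int32_t imm)
{
	return (uint64_t)(int64_t)imm;
}

/* Models "or $imm8,%r64" (REX.W 83 /1 ib): the 8-bit immediate is
 * sign-extended to 64 bits and ORed into the register. */
uint64_t or_imm8(uint64_t reg, int8_t imm)
{
	return reg | (uint64_t)(int64_t)imm;
}
```

With an immediate of -1, the or variant yields all ones regardless of what
the register held before, which is what lets the 4-byte form stand in for
the 7-byte mov.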

I would have expected gcc to use leaq for the address load, but leaq is
the same length (7 bytes) and probably has higher latency.
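
For anyone wanting to reproduce the comparison, a source file along the
following lines should be enough. This is a guess at the shape of the test,
not the file actually used: only the symbol name bluttan comes from the
relocation above, and sink, pass_address, pass_minus_one and last_arg are
hypothetical names.

```c
#include <stdint.h>

/* Stand-in for a symbol with a link-time constant address. */
char bluttan[1];

/* Records the argument so each call has an observable effect. */
static const void *last_seen;

/* noinline (a GCC/Clang attribute) keeps the argument setup visible at
 * each call site instead of letting the call be folded away. */
__attribute__((noinline)) void sink(const void *p)
{
	last_seen = p;
}

/* First argument is a symbol address: needs a relocation. */
void pass_address(void)
{
	sink(bluttan);
}

/* First argument is the small constant -1 cast to a pointer. */
void pass_minus_one(void)
{
	sink((const void *)(intptr_t)-1);
}

/* Accessor for the last value handed to sink(). */
const void *last_arg(void)
{
	return last_seen;
}
```

Compiling this with "gcc -O2 -mcmodel=kernel -c" (or -Os) and running
"objdump -dr" on the object file shows how %rdi is set up before each call.
Other compiler versions may choose different instructions, and a newer gcc
may specialise sink for its constant argument; moving sink into a separate
file avoids that.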

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

