Date:	Thu, 3 Dec 2009 13:10:56 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jan Beulich <JBeulich@...ell.com>
cc:	mingo@...e.hu, tglx@...utronix.de, hpa@...or.com,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: slightly shorten __ticket_spin_trylock() (v2)



On Thu, 3 Dec 2009, Jan Beulich wrote:
>  
> -static __always_inline int __ticket_spin_trylock(raw_spinlock_t *lock)
> +static __always_inline u8 __ticket_spin_trylock(raw_spinlock_t *lock)
>  {
>  	int tmp, new;
>  
> @@ -88,12 +88,11 @@ static __always_inline int __ticket_spin
>  		     LOCK_PREFIX "cmpxchgw %w1,%2\n\t"
>  		     "1:"
>  		     "sete %b1\n\t"
> -		     "movzbl %b1,%0\n\t"
>  		     : "=&a" (tmp), "=&q" (new), "+m" (lock->slock)
>  		     :
>  		     : "memory", "cc");
>  
> -	return tmp;
> +	return new;

This is fairly pessimal register allocation.

It used to be that we returned the value in 'tmp', which is %eax, which is 
also the expected return register. Now that we use 'new', it's some random 
other register that is _not_ %eax, which means that while we avoid a 
'movzbl', the regular spin_trylock function call case will now have the 
compiler emitting a 

	movb %X,%al
	ret

at the end just to get the right return register.

Which seems a bit annoying. It looks like we could have done that last 
instruction as just

	sete %b0

instead, and then return 'tmp' instead of 'new', keeping the return value 
in %al and avoiding the unnecessary movement.

> -static __always_inline int __ticket_spin_trylock(raw_spinlock_t *lock)
> +static __always_inline u8 __ticket_spin_trylock(raw_spinlock_t *lock)

Same thing here, afaik.

NOTE! I haven't actually looked at the generated code, and if we actually 
inline it all the way to the caller, it won't matter (and picking another 
register may even help). But while these helpers are marked 
__always_inline, I _thought_ that the way we actually build the final 
'spin_trylock()' function we end up with a real function in the end.

Maybe I'm wrong.

		Linus
