Date:   Mon, 11 Feb 2019 11:35:24 -0500
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will.deacon@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, linux-alpha@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-hexagon@...r.kernel.org, linux-ia64@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-sh@...r.kernel.org,
        sparclinux@...r.kernel.org, linux-xtensa@...ux-xtensa.org,
        linux-arch@...r.kernel.org, x86@...nel.org,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: [PATCH] locking/rwsem: Remove arch specific rwsem files

On 02/11/2019 06:58 AM, Peter Zijlstra wrote:
> Which is clearly worse. Now we can write that as:
>
>   int __down_read_trylock2(unsigned long *l)
>   {
> 	  long tmp = READ_ONCE(*l);
>
> 	  while (tmp >= 0) {
> 		  if (try_cmpxchg(l, &tmp, tmp + 1))
> 			  return 1;
> 	  }
>
> 	  return 0;
>   }
>
> which generates:
>
>   0000000000000030 <__down_read_trylock2>:
>   30:   48 8b 07                mov    (%rdi),%rax
>   33:   48 85 c0                test   %rax,%rax
>   36:   78 18                   js     50 <__down_read_trylock2+0x20>
>   38:   48 8d 50 01             lea    0x1(%rax),%rdx
>   3c:   f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
>   41:   75 f0                   jne    33 <__down_read_trylock2+0x3>
>   43:   b8 01 00 00 00          mov    $0x1,%eax
>   48:   c3                      retq
>   49:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
>   50:   31 c0                   xor    %eax,%eax
>   52:   c3                      retq
>
> Which is a lot better; but not quite there yet.
>
>
> I've tried quite a bit, but I can't seem to get GCC to generate the:
>
> 	add $1,%rdx
> 	jle
>
> required; stuff like:
>
> 	new = old + 1;
> 	if (new <= 0)
>
> generates:
>
> 	lea 0x1(%rax),%rdx
> 	test %rdx, %rdx
> 	jle

Thanks for the suggested code snippet. So you want to replace "lea
0x1(%rax),%rdx" with "add $1,%rdx"?

I think the compiler does that so it can use the address generation
unit for the addition instead of the ALU, which leaves the ALU
available for other arithmetic operations in parallel. I don't think
it is a good idea to override the compiler and force it to use the
ALU, so I am not going to try that. It is only 1 or 2 more
instructions anyway.
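
For anyone who wants to try the quoted loop outside the kernel, here is
a rough userspace sketch. It assumes C11 <stdatomic.h> as a stand-in for
the kernel's READ_ONCE() and try_cmpxchg(), and it assumes that a
negative count means a writer holds the lock. The function name
down_read_trylock_sketch() and the memory orderings are illustrative
choices, not what the patch uses.

```c
#include <stdatomic.h>

/*
 * Userspace approximation of the quoted __down_read_trylock2().
 * The compare-exchange call plays the role of try_cmpxchg(): when it
 * fails, it writes the value it found back into tmp, so the loop
 * re-checks the sign without a separate reload.
 */
static int down_read_trylock_sketch(atomic_long *count)
{
	long tmp = atomic_load_explicit(count, memory_order_relaxed);

	/* Assumption: a negative count means the lock is write-held. */
	while (tmp >= 0) {
		if (atomic_compare_exchange_strong_explicit(count, &tmp, tmp + 1,
							    memory_order_acquire,
							    memory_order_relaxed))
			return 1;	/* reader count incremented */
	}

	return 0;			/* writer present, give up */
}
```

Building this with optimization and disassembling it is one way to
compare the lea/test sequence against a hand-written add on a given
compiler version.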

Cheers,
Longman
