Date:   Fri, 24 Mar 2017 22:23:29 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Andy Lutomirski <luto@...capital.net>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andy Lutomirski <luto@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg()

On Fri, Mar 24, 2017 at 12:17:28PM -0700, Linus Torvalds wrote:
> On Fri, Mar 24, 2017 at 11:45 AM, Andy Lutomirski <luto@...capital.net> wrote:
> >
> > Is there some hack like if __builtin_is_unescaped(*val) *val = old;
> > that would work?
> 
> See my recent email suggesting a completely different interface, which
> avoids this problem.
> 
> My interface generates:
> 
> 0000000000000000 <T_refcount_inc>:
>    0: 8b 07                 mov    (%rdi),%eax
>    2: 83 f8 ff             cmp    $0xffffffff,%eax
>    5: 74 12                 je     19 <T_refcount_inc+0x19>
>    7: 85 c0                 test   %eax,%eax
>    9: 74 0a                 je     15 <T_refcount_inc+0x15>
>    b: 8d 50 01             lea    0x1(%rax),%edx
>    e: f0 0f b1 17           lock cmpxchg %edx,(%rdi)
>   12: 75 ee                 jne    2 <T_refcount_inc+0x2>
>   14: c3                   retq
>   15: 31 c0                 xor    %eax,%eax
>   17: 0f 0b                 ud2
>   19: c3                   retq
> 
> for PeterZ's test-case, which seems optimal.

Right; now my GCC emits more or less the same code (it's a slightly
different compiler, and instead of 12: jne it does 12: je; 14: jmp 2).

But maybe that's the likely() you added later.

Also, see how at 7 we test if eax is 0 and then at 9 jump to 15 where we
make eax 0. Pretty daft code-gen.

In any case, you lost one branch into ud2: your 'success: return' should
be 'success: if (new == UINT_MAX)', so that when we newly saturate the
count we also raise an exception.
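
For illustration, here is a rough userspace sketch of the increment with
that extra check on the success path. It uses C11 atomics rather than the
kernel's atomic_try_cmpxchg(), and sketch_refcount_inc(), sketch_warn()
and sketch_warnings are made-up names standing in for the real function
and for the ud2/WARN path; it is not the code from the patches.

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the ud2/WARN path: just count the hits. */
static int sketch_warnings;

static void sketch_warn(void)
{
	sketch_warnings++;
}

/* Saturating increment; assumption: relaxed ordering is enough here. */
static void sketch_refcount_inc(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);
	unsigned int new;

	do {
		if (old == UINT_MAX)
			return;		/* already saturated: leave it alone */
		if (old == 0) {
			sketch_warn();	/* increment from zero */
			return;
		}
		new = old + 1;
		/* on failure 'old' is reloaded with the current value */
	} while (!atomic_compare_exchange_weak_explicit(r, &old, new,
							memory_order_relaxed,
							memory_order_relaxed));

	if (new == UINT_MAX)
		sketch_warn();		/* newly saturated: the lost branch */
}
```

The compare-exchange updates 'old' on failure, so the loop never has to
reload the counter by hand; the final test is the only addition over the
quoted version.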

With that, the code is still larger than it used to be. I'll have a play
around. I do like this interface better, but getting GCC to generate
sensible code seems 'interesting'.

I'll try to redo the patches that landed in tip and see what it does
for total vmlinux size sometime tomorrow.
