[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVFOKC91hKWxm+-w2v2FPGJ37RoCOYX5FYpszjMrAaUOQ@mail.gmail.com>
Date: Fri, 24 Mar 2017 09:54:46 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Dmitry Vyukov <dvyukov@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Brian Gerst <brgerst@...il.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: locking/atomic: Introduce atomic_try_cmpxchg()
On Fri, Mar 24, 2017 at 9:41 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Fri, Mar 24, 2017 at 03:21:40PM +0100, Peter Zijlstra wrote:
>> On Fri, Mar 24, 2017 at 01:44:00PM +0100, Dmitry Vyukov wrote:
>
>> > So I would suggest to change it to a safer and less surprising
>> > alternative:
>> >
>> > diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
>> > index fb961db51a2a..81fb985f51f4 100644
>> > --- a/arch/x86/include/asm/cmpxchg.h
>> > +++ b/arch/x86/include/asm/cmpxchg.h
>> > @@ -212,7 +212,8 @@ extern void __add_wrong_size(void)
>> > default: \
>> > __cmpxchg_wrong_size(); \
>> > } \
>> > - *_old = __old; \
>> > + if (!success) \
>> > + *_old = __old; \
>> > success; \
>> > })
>>
>> I've no immediate objection, I'll double check what, if anything, it
>> does for code gen.
>
> So the first snipped I tested regressed like so:
>
>
> 0000000000000000 <T_refcount_inc>: 0000000000000000 <T_refcount_inc>:
> 0: 8b 07 mov (%rdi),%eax 0: 8b 17 mov (%rdi),%edx
> 2: 83 f8 ff cmp $0xffffffff,%eax 2: 83 fa ff cmp $0xffffffff,%edx
> 5: 74 13 je 1a <T_refcount_inc+0x1a> 5: 74 1a je 21 <T_refcount_inc+0x21>
> 7: 85 c0 test %eax,%eax 7: 85 d2 test %edx,%edx
> 9: 74 0d je 18 <T_refcount_inc+0x18> 9: 74 13 je 1e <T_refcount_inc+0x1e>
> b: 8d 50 01 lea 0x1(%rax),%edx b: 8d 4a 01 lea 0x1(%rdx),%ecx
> e: f0 0f b1 17 lock cmpxchg %edx,(%rdi) e: 89 d0 mov %edx,%eax
> 12: 75 ee jne 2 <T_refcount_inc+0x2> 10: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
> 14: ff c2 inc %edx 14: 74 04 je 1a <T_refcount_inc+0x1a>
> 16: 75 02 jne 1a <T_refcount_inc+0x1a> 16: 89 c2 mov %eax,%edx
> 18: 0f 0b ud2 18: eb e8 jmp 2 <T_refcount_inc+0x2>
> 1a: c3 retq 1a: ff c1 inc %ecx
> 1c: 75 03 jne 21 <T_refcount_inc+0x21>
> 1e: 0f 0b ud2
> 20: c3 retq
> 21: c3 retq
Can you re-send the better asm you got earlier?
If I pretend to be a dumb compiler, I wonder if you'd get better results with:
if (!success) {
*_old = __old;
return false;
} else {
return true;
}
or however you jam that into a statement expression. That way you
aren't relying on the compiler to merge the branches.
>
> Which is rather unfortunate...
--
Andy Lutomirski
AMA Capital Management, LLC
Powered by blists - more mailing lists