Message-ID: <20160601141607.GF355@arm.com>
Date: Wed, 1 Jun 2016 15:16:07 +0100
From: Will Deacon <will.deacon@....com>
To: David Howells <dhowells@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, linux-arch@...r.kernel.org,
x86@...nel.org, linux-kernel@...r.kernel.org,
ramana.radhakrishnan@....com, paulmck@...ux.vnet.ibm.com,
dwmw2@...radead.org
Subject: Re: [RFC PATCH 03/15] Provide atomic_t functions implemented with
ISO-C++11 atomics
On Thu, May 19, 2016 at 10:52:19AM +0100, David Howells wrote:
> Peter Zijlstra <peterz@...radead.org> wrote:
>
> > Does this generate 'sane' code for LL/SC archs? That is, a single LL/SC
> > loop and not a loop around an LL/SC cmpxchg.
>
> Depends on your definition of 'sane'. The code works, but it's not
> necessarily optimal: gcc currently keeps the __atomic_load_n() and the
> fudging in the middle separate from the __atomic_compare_exchange_n().
>
> So on aarch64:
>
> static __always_inline int __atomic_add_unless(atomic_t *v,
>                                                int addend, int unless)
> {
>         int cur = __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
>         int new;
>
>         do {
>                 if (__builtin_expect(cur == unless, 0))
>                         break;
>                 new = cur + addend;
>         } while (!__atomic_compare_exchange_n(&v->counter,
>                                               &cur, new,
>                                               false,
>                                               __ATOMIC_SEQ_CST,
>                                               __ATOMIC_RELAXED));
>         return cur;
> }
>
> int test_atomic_add_unless(atomic_t *counter)
> {
>         return __atomic_add_unless(counter, 0x56, 0x23);
> }
[...]
> I think the code it generates should look something like:
>
> test_atomic_add_unless:
> .L7:
>         ldaxr   w1, [x0]        # cur = __atomic_load_n()
>         cmp     w1, 35          # } if (cur == unless)
>         beq     .L4             # }         break
>         add     w2, w1, 86      # new = cur + addend
>         stlxr   w4, w2, [x0]    # } __atomic_compare_exchange_n()
>         cbnz    w4, .L7         # } retry if the store-exclusive failed
> .L4:
>         mov     w0, w1          # return cur
>         ret
>
> but that requires the compiler to split up the LDAXR and STLXR instructions
> and emit arbitrary code between them. I suspect that might be quite a
> stretch.
... it's also weaker than the requirements of the kernel memory model.
See 8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for full
barrier semantics") for the gory details.
Will