Message-ID: <20181019035340.ahjocmdj2o2zam4m@ast-mbp.dhcp.thefacebook.com>
Date: Thu, 18 Oct 2018 20:53:42 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: Peter Zijlstra <peterz@...radead.org>, paulmck@...ux.vnet.ibm.com,
will.deacon@....com, acme@...hat.com, yhs@...com,
john.fastabend@...il.com, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

On Thu, Oct 18, 2018 at 09:00:46PM +0200, Daniel Borkmann wrote:
> On 10/18/2018 05:33 PM, Alexei Starovoitov wrote:
> > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> >> #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */
> >> diff --git a/tools/arch/powerpc/include/asm/barrier.h b/tools/arch/powerpc/include/asm/barrier.h
> >> index a634da0..905a2c6 100644
> >> --- a/tools/arch/powerpc/include/asm/barrier.h
> >> +++ b/tools/arch/powerpc/include/asm/barrier.h
> >> @@ -27,4 +27,20 @@
> >> #define rmb() __asm__ __volatile__ ("sync" : : : "memory")
> >> #define wmb() __asm__ __volatile__ ("sync" : : : "memory")
> >>
> >> +#if defined(__powerpc64__)
> >> +#define smp_lwsync() __asm__ __volatile__ ("lwsync" : : : "memory")
> >> +
> >> +#define smp_store_release(p, v) \
> >> +do { \
> >> + smp_lwsync(); \
> >> + WRITE_ONCE(*p, v); \
> >> +} while (0)
> >> +
> >> +#define smp_load_acquire(p) \
> >> +({ \
> >> + typeof(*p) ___p1 = READ_ONCE(*p); \
> >> + smp_lwsync(); \
> >> + ___p1; \
> >
> > I don't like this proliferation of asm.
> > Why do we think that we can do a better job than the compiler?
> > can we please use gcc builtins instead?
> > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
> > __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
> > are done specifically for this use case if I'm not mistaken.
> > I think it pays to learn what the compiler provides.
>
> But are you sure the C11 memory model exactly matches the kernel's model?
> Seems like last time Will looked into it [0] it wasn't the case ...

I'm only suggesting equivalence of __atomic_load_n(ptr, __ATOMIC_ACQUIRE)
with the kernel's smp_load_acquire().
I've seen a bunch of user space ring buffer implementations built
on the __atomic_load_n() primitives.
But let's ask the experts who live in both worlds.

Paul,
what would you recommend?
Should we copy-paste smp_store_release() from the kernel to be used
in user space libraries/tools,
or use the __atomic_load_n() builtins instead?
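
For reference, a minimal sketch of the kind of user space ring buffer
mentioned above, written only with the gcc __atomic builtins. It assumes
a single producer and a single consumer. The names struct ring, RING_SIZE,
ring_push() and ring_pop() are hypothetical and come from neither perf nor
the kernel. It shows the acquire/release pairing only; it does not settle
whether the C11 model matches the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical capacity; must be a power of two */

struct ring {
	uint32_t head;			/* written only by the producer */
	uint32_t tail;			/* written only by the consumer */
	uint32_t data[RING_SIZE];
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct ring *r, uint32_t val)
{
	/* Only the producer writes head, so a relaxed load is enough. */
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store to tail. */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);

	if (head - tail == RING_SIZE)
		return false;

	r->data[head & (RING_SIZE - 1)] = val;
	/* Release: the slot write above is ordered before the new head. */
	__atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct ring *r, uint32_t *val)
{
	/* Only the consumer writes tail, so a relaxed load is enough. */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store to head. */
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);

	if (head == tail)
		return false;

	*val = r->data[tail & (RING_SIZE - 1)];
	/* Release: the slot read above is ordered before the slot is freed. */
	__atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
	return true;
}
```

The acquire load and release store here play the roles that
smp_load_acquire() and smp_store_release() play in the patch.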
> The above was pulled in and slightly adapted from kernel side of arch
> asm barriers. Hm, it would probably be safest if an arch decides to adapt
> C11 barriers first from kernel side and user space could then use the
> exact same matching builtin functions for scenarios like these as well.
>
> [0] https://lore.kernel.org/lkml/20170308174300.GL20400@arm.com/