Message-Id: <2ce07fbb-03b2-4096-bd76-e7546e20a33c@app.fastmail.com>
Date: Tue, 23 Jul 2024 10:26:05 +0200
From: "Arnd Bergmann" <arnd@...db.de>
To: "Yann Sionneau" <ysionneau@...rayinc.com>, linux-kernel@...r.kernel.org,
"Will Deacon" <will@...nel.org>, "Peter Zijlstra" <peterz@...radead.org>,
"Boqun Feng" <boqun.feng@...il.com>, "Mark Rutland" <mark.rutland@....com>,
"Yury Norov" <yury.norov@...il.com>,
"Rasmus Villemoes" <linux@...musvillemoes.dk>
Cc: "Jonathan Borne" <jborne@...rayinc.com>,
"Julian Vetter" <jvetter@...rayinc.com>,
"Clement Leger" <clement@...ment-leger.fr>,
"Jules Maselbas" <jmaselbas@...v.net>,
"Julien Villette" <julien.villette@...il.com>
Subject: Re: [RFC PATCH v3 15/37] kvx: Add atomic/locking headers
On Mon, Jul 22, 2024, at 11:41, ysionneau@...rayinc.com wrote:
> +#define ATOMIC64_RETURN_OP(op, c_op) \
> +static inline long arch_atomic64_##op##_return(long i, atomic64_t *v) \
> +{ \
> + long new, old, ret; \
> + \
> + do { \
> + old = arch_atomic64_read(v); \
> + new = old c_op i; \
> + ret = arch_cmpxchg(&v->counter, old, new); \
> + } while (ret != old); \
> + \
> + return new; \
> +}
> +
> +#define ATOMIC64_OP(op, c_op) \
> +static inline void arch_atomic64_##op(long i, atomic64_t *v) \
> +{ \
> + long new, old, ret; \
> + \
> + do { \
> + old = arch_atomic64_read(v); \
> + new = old c_op i; \
> + ret = arch_cmpxchg(&v->counter, old, new); \
> + } while (ret != old); \
> +}
These look suboptimal: you have a loop around arch_cmpxchg(),
which is itself built up from a loop, so a failed attempt
retries at two levels. You may want to change these to be
expressed in terms of the compiler intrinsics directly.
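
To illustrate the difference in shape, here is a minimal userspace sketch. It uses the generic GCC/Clang __atomic builtins as a stand-in, since which kvx-specific intrinsics are suitable is not stated here. All sketch_* names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's atomic64_t. */
typedef struct { int64_t counter; } sketch_atomic64_t;

/*
 * Shape used in the patch: an explicit retry loop around a
 * compare-and-exchange. If the compare-and-exchange is itself
 * implemented as a loop, the retries end up nested.
 */
static inline int64_t sketch_add_return_cmpxchg(int64_t i, sketch_atomic64_t *v)
{
	int64_t old, new;

	old = __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
	do {
		new = old + i;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!__atomic_compare_exchange_n(&v->counter, &old, new, 0,
					      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));

	return new;
}

/*
 * Alternative shape: one read-modify-write builtin, leaving it to
 * the compiler to emit a single loop (or a single instruction).
 */
static inline int64_t sketch_add_return_builtin(int64_t i, sketch_atomic64_t *v)
{
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}
```

Both functions return the same value; the second one just has no hand-written loop left to nest.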
> +#ifndef _ASM_KVX_BARRIER_H
> +#define _ASM_KVX_BARRIER_H
> +
> +/* fence is sufficient to guarantee write ordering */
> +#define mb() __builtin_kvx_fence()
> +
> +#include <asm-generic/barrier.h>
mb() is a fairly strong barrier itself and gets used
as a fallback for all weaker barriers (read-only,
write-only, dma-only, smp-only). Have you checked
if any of them can be weaker than
__builtin_kvx_fence(), e.g. a compiler-only barrier(),
like the SMP barriers on x86?
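
As a rough illustration of the fallback pattern in question, here is a userspace sketch. The sketch_* macros are hypothetical stand-ins, the full fence uses a generic GCC builtin in place of __builtin_kvx_fence(), and the choice to relax the write barrier is only an assumption to show the mechanism. Whether kvx hardware actually permits it is exactly what would need checking.

```c
/* Compiler-only barrier: no instruction is emitted, the compiler
 * just may not move memory accesses across it. */
#define sketch_barrier()	__asm__ __volatile__("" ::: "memory")

/* Full barrier; stands in for __builtin_kvx_fence() here. */
#define sketch_mb()		__atomic_thread_fence(__ATOMIC_SEQ_CST)

/*
 * Hypothetical arch override, defined before the fallbacks below.
 * Only valid if the hardware already keeps stores ordered against
 * other stores (an assumption, not a statement about kvx).
 */
#define sketch_smp_wmb()	sketch_barrier()

/* Generic-style fallbacks: anything the arch did not override
 * silently becomes the full barrier. */
#ifndef sketch_smp_rmb
#define sketch_smp_rmb()	sketch_mb()
#endif
#ifndef sketch_smp_wmb
#define sketch_smp_wmb()	sketch_mb()
#endif

struct sketch_msg {
	int payload;
	int ready;
};

/* Writer: payload must be visible before the ready flag. */
static inline void sketch_publish(struct sketch_msg *m, int value)
{
	m->payload = value;
	sketch_smp_wmb();		/* compiler-only in this sketch */
	*(volatile int *)&m->ready = 1;
}

/* Reader: check the flag, then read the payload. */
static inline int sketch_consume(struct sketch_msg *m, int *value)
{
	if (!*(volatile int *)&m->ready)
		return 0;
	sketch_smp_rmb();		/* falls back to the full barrier */
	*value = m->payload;
	return 1;
}
```

In this sketch only the write side was overridden, so the read side still pays for a full fence through the fallback.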
> +
> +#include <asm/cmpxchg.h>
> +
> +static inline int fls(int x)
> +{
> + return 32 - __builtin_kvx_clzw(x);
> +}
> +
> +static inline int fls64(__u64 x)
> +{
> + return 64 - __builtin_kvx_clzd(x);
> +}
The generic fallback for these uses __builtin_clz().
If that produces the same output as the kvx specific
intrinsics, you can just remove the above and
use the generic versions.
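
For comparison, a builtin-based fls()/fls64() has roughly the following shape. This is a userspace sketch with hypothetical sketch_* names, not a copy of the kernel headers. Note the explicit zero check: __builtin_clz(0) is undefined, so equivalence with the kvx intrinsics also depends on what those return for a zero input, which is not stated here.

```c
#include <stdint.h>

/* Find last (most significant) set bit, 1-based; 0 for x == 0. */
static inline int sketch_fls(unsigned int x)
{
	/* __builtin_clz() is undefined for 0, so guard that case. */
	return x ? (int)(sizeof(x) * 8) - __builtin_clz(x) : 0;
}

/* 64-bit variant with the same convention. */
static inline int sketch_fls64(uint64_t x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}
```

For nonzero inputs this matches the "32 - clz" and "64 - clz" forms in the patch.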
> +static __always_inline unsigned long __cmpxchg(unsigned long old,
> + unsigned long new,
> + volatile void *ptr, int size)
> +{
> + switch (size) {
> + case 4:
> + return __cmpxchg_u32(old, new, ptr);
> + case 8:
> + return __cmpxchg_u64(old, new, ptr);
> + default:
> + return __cmpxchg_called_with_bad_pointer();
> + }
> +}
With linux-6.11 you now also need to provide a single-byte
cmpxchg(). You can use cmpxchg_emu_u8() or provide a more
efficient custom one based on the 32/64-bit versions instead.
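
A custom byte-sized variant built on a 32-bit compare-and-exchange could look roughly like the userspace sketch below. It uses the generic GCC/Clang __atomic builtins in place of __cmpxchg_u32(), the sketch_cmpxchg_u8() name is hypothetical, and it assumes the byte lives inside a naturally aligned 32-bit word.

```c
#include <stdint.h>

/*
 * Emulate a one-byte cmpxchg with a 32-bit one on the aligned word
 * that contains the byte. Returns the byte value that was observed;
 * the exchange happened if and only if that equals 'old'.
 */
static inline uint8_t sketch_cmpxchg_u8(volatile uint8_t *ptr,
					uint8_t old, uint8_t new)
{
	uintptr_t addr = (uintptr_t)ptr;
	volatile uint32_t *word = (volatile uint32_t *)(addr & ~(uintptr_t)3);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	unsigned int shift = (3 - (addr & 3)) * 8;
#else
	unsigned int shift = (addr & 3) * 8;
#endif
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t cur, want;

	cur = __atomic_load_n(word, __ATOMIC_RELAXED);
	for (;;) {
		uint8_t seen = (uint8_t)((cur & mask) >> shift);

		/* Target byte differs: fail without writing. */
		if (seen != old)
			return seen;

		/* Splice the new byte in, keep the three neighbours. */
		want = (cur & ~mask) | ((uint32_t)new << shift);

		/* On failure 'cur' is reloaded; retry, since only a
		 * neighbouring byte may have changed. */
		if (__atomic_compare_exchange_n(word, &cur, want, 0,
						__ATOMIC_SEQ_CST,
						__ATOMIC_RELAXED))
			return old;
	}
}
```

The retry is needed because a concurrent change to one of the neighbouring bytes makes the word-sized exchange fail even when the target byte still matches.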
Arnd