Message-ID: <CAADnVQJf317mXSDLs=K0pzTDGqMA8vqSDoNm5=LvEst6kdAi6w@mail.gmail.com>
Date: Tue, 2 Sep 2025 10:43:41 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, LKML <linux-kernel@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>, bpf <bpf@...r.kernel.org>,
Arnd Bergmann <arnd@...db.de>, Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>, Mark Rutland <mark.rutland@....com>, harisokn@...zon.com,
cl@...two.org, Alexei Starovoitov <ast@...nel.org>, Kumar Kartikeya Dwivedi <memxor@...il.com>,
zhenglifeng1@...wei.com, xueshuai@...ux.alibaba.com,
joao.m.martins@...cle.com, Boris Ostrovsky <boris.ostrovsky@...cle.com>,
konrad.wilk@...cle.com
Subject: Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()

On Mon, Sep 1, 2025 at 4:28 AM Catalin Marinas <catalin.marinas@....com> wrote:
>
> On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
> > diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
> > index a385603436e9..ce8feadeb9a9 100644
> > --- a/arch/arm64/include/asm/rqspinlock.h
> > +++ b/arch/arm64/include/asm/rqspinlock.h
> > @@ -3,6 +3,9 @@
> > #define _ASM_RQSPINLOCK_H
> >
> > #include <asm/barrier.h>
> > +
> > +#define res_smp_cond_load_acquire_waiting() arch_timer_evtstrm_available()
>
> More on this below, I don't think we should define it.
>
> > diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
> > index 5ab354d55d82..8de1395422e8 100644
> > --- a/kernel/bpf/rqspinlock.c
> > +++ b/kernel/bpf/rqspinlock.c
> > @@ -82,6 +82,7 @@ struct rqspinlock_timeout {
> > u64 duration;
> > u64 cur;
> > u16 spin;
> > + u8 wait;
> > };
> >
> > #define RES_TIMEOUT_VAL 2
> > @@ -241,26 +242,20 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
> > }
> >
> > /*
> > - * Do not amortize with spins when res_smp_cond_load_acquire is defined,
> > - * as the macro does internal amortization for us.
> > + * Only amortize with spins when we don't have a waiting implementation.
> > */
> > -#ifndef res_smp_cond_load_acquire
> > #define RES_CHECK_TIMEOUT(ts, ret, mask) \
> > ({ \
> > - if (!(ts).spin++) \
> > + if ((ts).wait || !(ts).spin++) \
> > (ret) = check_timeout((lock), (mask), &(ts)); \
> > (ret); \
> > })
> > -#else
> > -#define RES_CHECK_TIMEOUT(ts, ret, mask) \
> > - ({ (ret) = check_timeout((lock), (mask), &(ts)); })
> > -#endif
>
> IIUC, RES_CHECK_TIMEOUT in the current res_smp_cond_load_acquire() usage
> doesn't amortise the spins, as the comment suggests, but rather the
> calls to check_timeout(). This is fine, it matches the behaviour of
> smp_cond_load_relaxed_timewait() you introduced in the first patch. The
> only difference is the number of spins - 200 (matching poll_idle) vs 64K
> above. Does 200 work for the above?
>
> > /*
> > * Initialize the 'spin' member.
> > * Set spin member to 0 to trigger AA/ABBA checks immediately.
> > */
> > -#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
> > +#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
>
> First of all, I don't really like the smp_cond_load_acquire_waiting(),
> that's an implementation detail of smp_cond_load_*_timewait() that
> shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
> also used outside the smp_cond_load_acquire_timewait() condition. The
> (ts).wait check only makes sense when used together with the WFE
> waiting.

+1 to the above.

Penalizing all other architectures with a pointless runtime check:

> - if (!(ts).spin++) \
> + if ((ts).wait || !(ts).spin++) \

is not acceptable.
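
For illustration only, and not taken from the patch: a minimal userspace sketch of how the choice between "check every time" and "amortize with the spin counter" can be made at compile time, as the removed #ifndef block did, so that the common path carries no extra runtime flag. All demo_* names and DEMO_ARCH_HAS_WAITING are hypothetical stand-ins; demo_check_timeout() only counts how often it is called and does none of the real check_timeout() work.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rqspinlock_timeout. */
struct demo_timeout {
	uint16_t spin;         /* wraps back to 0 after 65536 increments */
	unsigned long checks;  /* how many times the expensive check ran */
};

/* Hypothetical stand-in for check_timeout(): only counts invocations. */
static int demo_check_timeout(struct demo_timeout *ts)
{
	ts->checks++;
	return 0;
}

#ifdef DEMO_ARCH_HAS_WAITING
/* Assumed waiting implementation: the wait primitive amortizes for us. */
static int demo_res_check_timeout(struct demo_timeout *ts, int ret)
{
	(void)ret;
	return demo_check_timeout(ts);
}
#else
/* Default: call the expensive check only when the u16 counter is 0. */
static int demo_res_check_timeout(struct demo_timeout *ts, int ret)
{
	if (!ts->spin++)
		ret = demo_check_timeout(ts);
	return ret;
}
#endif

/* Run the check 'iterations' times and report how often it really ran. */
static unsigned long demo_count_checks(unsigned long iterations)
{
	struct demo_timeout ts = { .spin = 0, .checks = 0 };
	int ret = 0;

	for (unsigned long i = 0; i < iterations; i++)
		ret = demo_res_check_timeout(&ts, ret);
	(void)ret;
	return ts.checks;
}
```

Built without DEMO_ARCH_HAS_WAITING, the first call runs the check immediately (spin starts at 0, matching the "trigger AA/ABBA checks immediately" comment) and the next one happens only after the u16 counter wraps, which is the 64K figure mentioned above.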