Message-ID: <CAP01T77QD_pYBVS4PfG3jDeXObKHZJkV2nQX+0njv11oKTEqRA@mail.gmail.com>
Date: Thu, 9 Jan 2025 02:11:04 +0530
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Waiman Long <llong@...hat.com>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>, Peter Zijlstra <peterz@...radead.org>,
Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Martin KaFai Lau <martin.lau@...nel.org>,
Eduard Zingerman <eddyz87@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>, Tejun Heo <tj@...nel.org>,
Barret Rhoden <brho@...gle.com>, Josh Don <joshdon@...gle.com>, Dohyun Kim <dohyunkim@...gle.com>,
kernel-team@...a.com
Subject: Re: [PATCH bpf-next v1 14/22] rqspinlock: Add macros for rqspinlock usage
On Wed, 8 Jan 2025 at 22:26, Waiman Long <llong@...hat.com> wrote:
>
> On 1/7/25 8:59 AM, Kumar Kartikeya Dwivedi wrote:
> > Introduce helper macros that wrap around the rqspinlock slow path and
> > provide an interface analogous to the raw_spin_lock API. Note that
> > in case of error conditions, preemption and IRQ disabling is
> > automatically unrolled before returning the error back to the caller.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@...il.com>
> > ---
> > include/asm-generic/rqspinlock.h | 58 ++++++++++++++++++++++++++++++++
> > 1 file changed, 58 insertions(+)
> >
> > diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
> > index dc436ab01471..53be8426373c 100644
> > --- a/include/asm-generic/rqspinlock.h
> > +++ b/include/asm-generic/rqspinlock.h
> > @@ -12,8 +12,10 @@
> > #include <linux/types.h>
> > #include <vdso/time64.h>
> > #include <linux/percpu.h>
> > +#include <asm/qspinlock.h>
> >
> > struct qspinlock;
> > +typedef struct qspinlock rqspinlock_t;
> >
> > extern int resilient_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val, u64 timeout);
> >
> > @@ -82,4 +84,60 @@ static __always_inline void release_held_lock_entry(void)
> > this_cpu_dec(rqspinlock_held_locks.cnt);
> > }
> >
> > +/**
> > + * res_spin_lock - acquire a queued spinlock
> > + * @lock: Pointer to queued spinlock structure
> > + */
> > +static __always_inline int res_spin_lock(rqspinlock_t *lock)
> > +{
> > + int val = 0;
> > +
> > + if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) {
> > + grab_held_lock_entry(lock);
> > + return 0;
> > + }
> > + return resilient_queued_spin_lock_slowpath(lock, val, RES_DEF_TIMEOUT);
> > +}
> > +
> > +static __always_inline void res_spin_unlock(rqspinlock_t *lock)
> > +{
> > + struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
> > +
> > + if (unlikely(rqh->cnt > RES_NR_HELD))
> > + goto unlock;
> > + WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
> > + /*
> > + * Release barrier, ensuring ordering. See release_held_lock_entry.
> > + */
> > +unlock:
> > + queued_spin_unlock(lock);
> > + this_cpu_dec(rqspinlock_held_locks.cnt);
> > +}
> > +
> > +#define raw_res_spin_lock_init(lock) ({ *(lock) = (struct qspinlock)__ARCH_SPIN_LOCK_UNLOCKED; })
> > +
> > +#define raw_res_spin_lock(lock) \
> > + ({ \
> > + int __ret; \
> > + preempt_disable(); \
> > + __ret = res_spin_lock(lock); \
> > + if (__ret) \
> > + preempt_enable(); \
> > + __ret; \
> > + })
> > +
> > +#define raw_res_spin_unlock(lock) ({ res_spin_unlock(lock); preempt_enable(); })
> > +
> > +#define raw_res_spin_lock_irqsave(lock, flags) \
> > + ({ \
> > + int __ret; \
> > + local_irq_save(flags); \
> > + __ret = raw_res_spin_lock(lock); \
> > + if (__ret) \
> > + local_irq_restore(flags); \
> > + __ret; \
> > + })
> > +
> > +#define raw_res_spin_unlock_irqrestore(lock, flags) ({ raw_res_spin_unlock(lock); local_irq_restore(flags); })
> > +
> > #endif /* __ASM_GENERIC_RQSPINLOCK_H */
>
> Lockdep calls aren't included in the helper functions. That means all
> the *res_spin_lock*() calls will be outside the purview of lockdep. That
> also means a multi-CPU circular locking dependency involving a mixture
> of qspinlocks and rqspinlocks may not be detectable.
Yes, this is true, but I am not sure lockdep fits well here, or how to
map its semantics onto rqspinlock.
Some BPF users (e.g. in patch 17) expect and rely on rqspinlock to
return errors on AA deadlocks, since nesting is possible, so we would
get false alarms from lockdep. Lockdep would also need to treat
rqspinlock as a trylock, since acquisition is fallible, and IIUC it
skips deadlock diagnosis for trylocks.
Most users choose rqspinlock precisely because a deadlock may be
constructed at runtime (either by BPF programs themselves or by
attaching programs to the kernel), so lockdep splats on debug kernels
would not be helpful.
Even if a mix of qspinlocks and rqspinlocks were involved in an ABBA
situation, as long as the rqspinlock is being acquired in one of the
threads, that acquisition will still time out even if check_deadlock
fails to establish the presence of a deadlock. The qspinlock call on
the other side can then make progress, provided the kernel unwinds
locks correctly on failure (by handling rqspinlock errors and
releasing held locks on the way out).
>
> Cheers,
> Longman
>