Message-ID: <20200123101649.GF14946@hirez.programming.kicks-ass.net>
Date: Thu, 23 Jan 2020 11:16:49 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Alex Kogan <alex.kogan@...cle.com>
Cc: linux@...linux.org.uk, mingo@...hat.com, will.deacon@....com,
arnd@...db.de, longman@...hat.com, linux-arch@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
tglx@...utronix.de, bp@...en8.de, hpa@...or.com, x86@...nel.org,
guohanjun@...wei.com, jglauber@...vell.com,
steven.sistare@...cle.com, daniel.m.jordan@...cle.com,
dave.dice@...cle.com
Subject: Re: [PATCH v9 3/5] locking/qspinlock: Introduce CNA into the slow
path of qspinlock

On Thu, Jan 23, 2020 at 11:06:35AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 10:26:58AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 14, 2020 at 10:59:18PM -0500, Alex Kogan wrote:
> > > +/* this function is called only when the primary queue is empty */
> > > +static inline bool cna_try_change_tail(struct qspinlock *lock, u32 val,
> > > + struct mcs_spinlock *node)
> > > +{
> > > + struct mcs_spinlock *head_2nd, *tail_2nd;
> > > + u32 new;
> > > +
> > > + /* If the secondary queue is empty, do what MCS does. */
> > > + if (node->locked <= 1)
> > > + return __try_clear_tail(lock, val, node);
> > > +
> > > + /*
> > > + * Try to update the tail value to the last node in the secondary queue.
> > > + * If successful, pass the lock to the first thread in the secondary
> > > + * queue. Doing those two actions effectively moves all nodes from the
> > > + * secondary queue into the main one.
> > > + */
> > > + tail_2nd = decode_tail(node->locked);
> > > + head_2nd = tail_2nd->next;
> > > + new = ((struct cna_node *)tail_2nd)->encoded_tail + _Q_LOCKED_VAL;
> > > +
> > > + if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
> > > + /*
> > > + * Try to reset @next in tail_2nd to NULL, but no need to check
> > > + * the result - if failed, a new successor has updated it.
> > > + */
> >
> > I think you actually have an ordering bug here; the load of head_2nd
> > *must* happen before the atomic_try_cmpxchg(), otherwise it might
> > observe the new next and clear a valid next pointer.
> >
> > What would be the best fix for that; I'm thinking:
> >
> > head_2nd = smp_load_acquire(&tail_2nd->next);
> >
> > Will?
>
> Hmm, given we've not passed the lock around yet; why wouldn't something
> like this work:
>
> 	smp_store_release(&tail_2nd->next, NULL);

Argh, make that:

	tail_2nd->next = NULL;
	smp_wmb();

> 	if (!atomic_try_cmpxchg_relaxed(&lock->val, &val, new)) {
> 		tail_2nd->next = head_2nd;
> 		return false;
> 	}
>
> The whole second queue is only ever modified by the lock owner, and that
> is us, so we can pre-terminate the secondary queue (break the circular
> link), try the cmpxchg and fix it back up when it fails.
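
To make the pre-terminate / cmpxchg / restore sequence concrete, here is a
rough userspace sketch using C11 atomics. It is not the kernel code: struct
node, struct lock, LOCKED_VAL and try_splice_secondary() are hypothetical
stand-ins for mcs_spinlock/cna_node, qspinlock, _Q_LOCKED_VAL and
cna_try_change_tail(), and the release fence only approximates smp_wmb()
(it is assumed to be at least as strong as the store-store ordering wanted
here).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct node {
	struct node *_Atomic next;
	uint32_t encoded_tail;
};

struct lock {
	_Atomic uint32_t val;
};

#define LOCKED_VAL 1U

/*
 * @tail_2nd is the last node of a circular secondary queue, so
 * tail_2nd->next is that queue's head. Only the lock owner (the caller)
 * may modify the secondary queue.
 */
static bool try_splice_secondary(struct lock *lock, uint32_t val,
				 struct node *tail_2nd)
{
	struct node *head_2nd =
		atomic_load_explicit(&tail_2nd->next, memory_order_relaxed);
	uint32_t new = tail_2nd->encoded_tail + LOCKED_VAL;

	/* A non-empty circular queue always has a non-NULL tail->next. */
	assert(head_2nd != NULL);

	/* Pre-terminate the queue: break the circular link up front. */
	atomic_store_explicit(&tail_2nd->next, NULL, memory_order_relaxed);

	/*
	 * Order the NULL store before the tail update becomes visible.
	 * Approximates smp_wmb(); assumed sufficient for this sketch.
	 */
	atomic_thread_fence(memory_order_release);

	if (!atomic_compare_exchange_strong_explicit(&lock->val, &val, new,
						     memory_order_relaxed,
						     memory_order_relaxed)) {
		/* Tail changed under us: restore the circular link. */
		atomic_store_explicit(&tail_2nd->next, head_2nd,
				      memory_order_relaxed);
		return false;
	}

	return true;
}
```

Because ->next is already NULL before the new tail is published, a new
waiter that swaps itself in after the cmpxchg and writes tail_2nd->next can
no longer have its pointer clobbered by a late clear from the lock owner.
On failure nobody else has seen the secondary queue, so putting head_2nd
back is safe.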