Message-Id: <20171017203759.GZ3521@linux.vnet.ibm.com>
Date: Tue, 17 Oct 2017 13:37:59 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Will Deacon <will.deacon@....com>
Cc: Boqun Feng <boqun.feng@...il.com>, stern@...land.harvard.edu,
parri.andrea@...il.com, peterz@...radead.org, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
linux-kernel@...r.kernel.org
Subject: Re: Linux-kernel examples for LKMM recipes

On Thu, Oct 12, 2017 at 12:27:19PM +0100, Will Deacon wrote:
> On Thu, Oct 12, 2017 at 09:23:59AM +0800, Boqun Feng wrote:
> > On Wed, Oct 11, 2017 at 10:32:30PM +0000, Paul E. McKenney wrote:
> > > I am not aware of any three-CPU release-acquire chains in the
> > > Linux kernel. There are three-CPU lock-based chains in RCU,
> > > but these are not at all simple, either.
> > >
> >
> > The "Program-Order guarantees" case in the scheduler? See the comments
> > written by Peter above try_to_wake_up():
> >
> > * The basic program-order guarantee on SMP systems is that when a task [t]
> > * migrates, all its activity on its old CPU [c0] happens-before any subsequent
> > * execution on its new CPU [c1].
> > ...
> > * For blocking we (obviously) need to provide the same guarantee as for
> > * migration. However the means are completely different as there is no lock
> > * chain to provide order. Instead we do:
> > *
> > * 1) smp_store_release(X->on_cpu, 0)
> > * 2) smp_cond_load_acquire(!X->on_cpu)
> > *
> > * Example:
> > *
> > *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
> > *
> > *   LOCK rq(0)->lock LOCK X->pi_lock
> > *   dequeue X
> > *   sched-out X
> > *   smp_store_release(X->on_cpu, 0);
> > *
> > *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
> > *                    X->state = WAKING
> > *                    set_task_cpu(X,2)
> > *
> > *                    LOCK rq(2)->lock
> > *                    enqueue X
> > *                    X->state = RUNNING
> > *                    UNLOCK rq(2)->lock
> > *
> > *                                          LOCK rq(2)->lock // orders against CPU1
> > *                                          sched-out Z
> > *                                          sched-in X
> > *                                          UNLOCK rq(2)->lock
> > *
> > *                    UNLOCK X->pi_lock
> > *   UNLOCK rq(0)->lock
> >
> > This is a chain that mixes locks with release-acquire (maybe even better?).
> >
> >
> > Another example would be osq_{lock,unlock}() on multiple (more than
> > three) CPUs.
>
> I think the qrwlock also has something similar with the writer fairness
> issue fixed:
>
> CPU0: (writer doing an unlock)
> smp_store_release(&lock->wlocked, 0); // Bottom byte of lock->cnts
>
>
> CPU1: (waiting writer on slowpath)
> atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> ...
> arch_spin_unlock(&lock->wait_lock);
>
>
> CPU2: (reader on slowpath)
> arch_spin_lock(&lock->wait_lock);
>
> and there's mixed-size accesses here too. Fun stuff!

You had me going there until you mentioned the mixed-size accesses. ;-)

							Thanx, Paul
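
For readers who want the bare shape of the three-CPU release-acquire
chain discussed above, here is a minimal userspace sketch. It is an
analogue only, not kernel code: it assumes C11 atomics and pthreads,
with memory_order_release and memory_order_acquire standing in for
smp_store_release() and smp_load_acquire(). All names (payload, flag01,
flag12, run_release_acquire_chain) are made up for illustration and
appear nowhere in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a userspace C11 analogue. */
static int payload;          /* plain, non-atomic data            */
static atomic_int flag01;    /* hand-off from thread 0 to thread 1 */
static atomic_int flag12;    /* hand-off from thread 1 to thread 2 */
static int observed;         /* what thread 2 read from payload    */

static void *thread0(void *arg)
{
	(void)arg;
	payload = 42;
	/* Release: the payload store is ordered before this store. */
	atomic_store_explicit(&flag01, 1, memory_order_release);
	return NULL;
}

static void *thread1(void *arg)
{
	(void)arg;
	/* Acquire: pairs with thread 0's release store. */
	while (!atomic_load_explicit(&flag01, memory_order_acquire))
		;
	/* Release: extends the chain on to thread 2. */
	atomic_store_explicit(&flag12, 1, memory_order_release);
	return NULL;
}

static void *thread2(void *arg)
{
	(void)arg;
	/* Acquire: pairs with thread 1's release store. */
	while (!atomic_load_explicit(&flag12, memory_order_acquire))
		;
	/* Thread 0's store to payload happens-before this plain read. */
	observed = payload;
	return NULL;
}

/* Runs the chain once and returns the value thread 2 observed. */
int run_release_acquire_chain(void)
{
	pthread_t t[3];
	int rc;

	payload = 0;
	observed = 0;
	atomic_store(&flag01, 0);
	atomic_store(&flag12, 0);

	/* Start the last link first so the waiters really do wait. */
	rc = pthread_create(&t[2], NULL, thread2, NULL);
	assert(rc == 0);
	rc = pthread_create(&t[1], NULL, thread1, NULL);
	assert(rc == 0);
	rc = pthread_create(&t[0], NULL, thread0, NULL);
	assert(rc == 0);
	(void)rc;

	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return observed;
}
```

Thread 2 never touches flag01, yet it is guaranteed to see thread 0's
store to payload because happens-before carries across both
release-acquire links. Note that this sketch uses same-size accesses
throughout, so it sidesteps the mixed-size wrinkle in the qrwlock case.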