linux-kernel - Re: Linux-kernel examples for LKMM recipes

Message-Id: <20171017203759.GZ3521@linux.vnet.ibm.com>
Date:   Tue, 17 Oct 2017 13:37:59 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Will Deacon <will.deacon@....com>
Cc:     Boqun Feng <boqun.feng@...il.com>, stern@...land.harvard.edu,
        parri.andrea@...il.com, peterz@...radead.org, npiggin@...il.com,
        dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
        linux-kernel@...r.kernel.org
Subject: Re: Linux-kernel examples for LKMM recipes

On Thu, Oct 12, 2017 at 12:27:19PM +0100, Will Deacon wrote:
> On Thu, Oct 12, 2017 at 09:23:59AM +0800, Boqun Feng wrote:
> > On Wed, Oct 11, 2017 at 10:32:30PM +0000, Paul E. McKenney wrote:
> > > 	I am not aware of any three-CPU release-acquire chains in the
> > > 	Linux kernel.  There are three-CPU lock-based chains in RCU,
> > > 	but these are not at all simple, either.
> > > 
> > 
> > The "Program-Order guarantees" case in the scheduler? See the comments
> > written by Peter above try_to_wake_up():
> > 
> >  * The basic program-order guarantee on SMP systems is that when a task [t]
> >  * migrates, all its activity on its old CPU [c0] happens-before any subsequent
> >  * execution on its new CPU [c1].
> > ...
> >  * For blocking we (obviously) need to provide the same guarantee as for
> >  * migration. However the means are completely different as there is no lock
> >  * chain to provide order. Instead we do:
> >  *
> >  *   1) smp_store_release(X->on_cpu, 0)
> >  *   2) smp_cond_load_acquire(!X->on_cpu)
> >  *
> >  * Example:
> >  *
> >  *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
> >  *
> >  *   LOCK rq(0)->lock LOCK X->pi_lock
> >  *   dequeue X
> >  *   sched-out X
> >  *   smp_store_release(X->on_cpu, 0);
> >  *
> >  *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
> >  *                    X->state = WAKING
> >  *                    set_task_cpu(X,2)
> >  *
> >  *                    LOCK rq(2)->lock
> >  *                    enqueue X
> >  *                    X->state = RUNNING
> >  *                    UNLOCK rq(2)->lock
> >  *
> >  *                                          LOCK rq(2)->lock // orders against CPU1
> >  *                                          sched-out Z
> >  *                                          sched-in X
> >  *                                          UNLOCK rq(2)->lock
> >  *
> >  *                    UNLOCK X->pi_lock
> >  *   UNLOCK rq(0)->lock
> > 
> > This is a chain that mixes locks with acquire-release (maybe even better?).
> > 
> > 
> > And another example would be osq_{lock,unlock}() on multiple (more than
> > three) CPUs.
> 
> I think the qrwlock also has something similar with the writer fairness
> issue fixed:
> 
> CPU0: (writer doing an unlock)
> smp_store_release(&lock->wlocked, 0);	// Bottom byte of lock->cnts
> 
> 
> CPU1: (waiting writer on slowpath)
> atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> ...
> arch_spin_unlock(&lock->wait_lock);
> 
> 
> CPU2: (reader on slowpath)
> arch_spin_lock(&lock->wait_lock);
> 
> and there's mixed-size accesses here too. Fun stuff!

You had me going there until you mentioned the mixed-size accesses.  ;-)
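
Setting the mixed-size accesses aside, the general shape of a three-CPU
release-acquire chain can be sketched in userspace as a rough analogy.
The sketch below is not kernel code and is not taken from the scheduler
or qrwlock: it assumes C11 atomics and pthreads standing in for
smp_store_release() and smp_load_acquire(), and the names payload,
flag01, flag12 and run_chain() are hypothetical.  Thread 0 writes plain
data and does a store-release, thread 1 does a load-acquire followed by
its own store-release, and thread 2 does a load-acquire and then reads
the data.  Because each acquire reads from the preceding release, thread
0's plain write is ordered before thread 2's read.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data written only by thread 0 */
static atomic_int flag01;  /* hand-off from thread 0 to thread 1 */
static atomic_int flag12;  /* hand-off from thread 1 to thread 2 */
static int observed;       /* what thread 2 read from payload */

static void *thread0(void *arg)
{
	(void)arg;
	payload = 42;
	/* Release: orders the payload write before the flag write. */
	atomic_store_explicit(&flag01, 1, memory_order_release);
	return NULL;
}

static void *thread1(void *arg)
{
	(void)arg;
	/* Acquire: spin until thread 0's release store is visible. */
	while (!atomic_load_explicit(&flag01, memory_order_acquire))
		;
	/* Release: extends the chain to thread 2. */
	atomic_store_explicit(&flag12, 1, memory_order_release);
	return NULL;
}

static void *thread2(void *arg)
{
	(void)arg;
	/* Acquire: spin until thread 1's release store is visible. */
	while (!atomic_load_explicit(&flag12, memory_order_acquire))
		;
	/* Ordered after thread 0's write through the two-link chain. */
	observed = payload;
	return NULL;
}

/* Runs the three threads once and returns the value thread 2 saw. */
int run_chain(void)
{
	void *(*fns[3])(void *) = { thread2, thread1, thread0 };
	pthread_t th[3];
	int i, rc;

	payload = 0;
	observed = -1;
	atomic_store(&flag01, 0);
	atomic_store(&flag12, 0);

	/* Start the waiters first so that they really do spin. */
	for (i = 0; i < 3; i++) {
		rc = pthread_create(&th[i], NULL, fns[i], NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < 3; i++)
		pthread_join(th[i], NULL);
	return observed;
}
```

Calling run_chain() from a test harness should return 42 on every run,
since the C11 happens-before relation is transitive across the two
release-acquire links.  Weakening either acquire or either release to
memory_order_relaxed would remove that guarantee.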

							Thanx, Paul