linux-kernel - Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150413150832.GH5029@twins.programming.kicks-ass.net>
Date:	Mon, 13 Apr 2015 17:08:32 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Waiman Long <waiman.long@...com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, linux-arch@...r.kernel.org,
	x86@...nel.org, linux-kernel@...r.kernel.org,
	virtualization@...ts.linux-foundation.org,
	xen-devel@...ts.xenproject.org, kvm@...r.kernel.org,
	Paolo Bonzini <paolo.bonzini@...il.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Daniel J Blueman <daniel@...ascale.com>,
	Scott J Norton <scott.norton@...com>,
	Douglas Hatch <doug.hatch@...com>
Subject: Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support
 for the qspinlock

On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:

> >>+static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
> >>+{
> >>+	struct __qspinlock *l = (void *)lock;
> >>+	struct qspinlock **lp = NULL;
> >>+	struct pv_node *pn = (struct pv_node *)node;
> >>+	int slow_set = false;
> >>+	int loop;
> >>+
> >>+	for (;;) {
> >>+		for (loop = SPIN_THRESHOLD; loop; loop--) {
> >>+			if (!READ_ONCE(l->locked))
> >>+				return;
> >>+
> >>+			cpu_relax();
> >>+		}
> >>+
> >>+		WRITE_ONCE(pn->state, vcpu_halted);
> >>+		if (!lp)
> >>+			lp = pv_hash(lock, pn);
> >>+		/*
> >>+		 * lp must be set before setting _Q_SLOW_VAL
> >>+		 *
> >>+		 * [S] lp = lock                [RmW] l = l->locked = 0
> >>+		 *     MB                             MB
> >>+		 * [S] l->locked = _Q_SLOW_VAL  [L]   lp
> >>+		 *
> >>+		 * Matches the cmpxchg() in pv_queue_spin_unlock().
> >>+		 */
> >>+		if (!slow_set&&
> >>+		    !cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL)) {
> >>+			/*
> >>+			 * The lock is free and _Q_SLOW_VAL has never been
> >>+			 * set. Need to clear the hash bucket before getting
> >>+			 * the lock.
> >>+			 */
> >>+			WRITE_ONCE(*lp, NULL);
> >>+			return;
> >>+		} else if (slow_set&&  !READ_ONCE(l->locked))
> >>+			return;
> >>+		slow_set = true;

> >I'm somewhat puzzled by the slow_set thing; what is wrong with the thing
> >I had, namely:
> >
> >		if (!lp) {
> >			lp = pv_hash(lock, pn);
> >
> >			/*
> >			 * comment
> >			 */
> >			lv = cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL);
> >			if (lv != _Q_LOCKED_VAL) {
> >				/* we're woken, unhash and return */
> >				WRITE_ONCE(*lp, NULL);
> >				return;
> >			}
> >		}
> >>+
> >>+		pv_wait(&l->locked, _Q_SLOW_VAL);
> >
> >If we get a spurious wakeup (due to device interrupts or random kick)
> >we'll loop around but ->locked will remain _Q_SLOW_VAL.
> 
> The purpose of the slow_set flag is not about the lock value. It is to make
> sure that pv_hash_find() will always find a match. Consider the following
> scenario:
> 
> cpu1            cpu2                    cpu3
> ----            ----                    ----
> pv_wait
> spurious wakeup
> loop l->locked
> 
>                 read _Q_SLOW_VAL
>                 pv_hash_find()
>                 unlock
> 
>                                         pv_hash() <- same entry
> 
> cmpxchg(&l->locked)
> clear hash (?)
> 
> Here, the entry for cpu3 will be removed leading to panic when
> pv_hash_find() can find the entry. So the hash entry can only be
> removed if the other cpu has no chance to see _Q_SLOW_VAL.

Still confused. Afaict that cannot happen with my code. We only do the
cmpxchg(, _Q_SLOW_VAL) _once_.

Only on the first time around that loop will we hash the lock and set
the slow flag. And cpu3 cannot come in on the same entry because we've
not yet released the lock when we find and unhash.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/