linux-kernel - Re: [PATCH v10 03/19] qspinlock: Add pending bit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20140514165121.GA21370@potion.redhat.com>
Date:	Wed, 14 May 2014 18:51:24 +0200
From:	Radim Krčmář <rkrcmar@...hat.com>
To:	Waiman Long <waiman.long@...com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-arch@...r.kernel.org, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	virtualization@...ts.linux-foundation.org,
	xen-devel@...ts.xenproject.org, kvm@...r.kernel.org,
	Paolo Bonzini <paolo.bonzini@...il.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Gleb Natapov <gleb@...hat.com>,
	Scott J Norton <scott.norton@...com>,
	Chegu Vinod <chegu_vinod@...com>
Subject: Re: [PATCH v10 03/19] qspinlock: Add pending bit

2014-05-13 15:47-0400, Waiman Long:
> On 05/12/2014 11:22 AM, Radim Krčmář wrote:
> >I think there is an unwanted scenario on virtual machines:
> >1) VCPU sets the pending bit and start spinning.
> >2) Pending VCPU gets descheduled.
> >     - we have PLE and lock holder isn't running [1]
> >     - the hypervisor randomly preempts us
> >3) Lock holder unlocks while pending VCPU is waiting in queue.
> >4) Subsequent lockers will see free lock with set pending bit and will
> >    loop in trylock's 'for (;;)'
> >     - the worst-case is lock starving [2]
> >     - PLE can save us from wasting whole timeslice
> >
> >Retry threshold is the easiest solution, regardless of its ugliness [4].
> 
> Yes, that can be a real issue. Some sort of retry threshold, as you said,
> should be able to handle it.
> 
> BTW, the relevant patch should be 16/19 where the PV spinlock stuff should
> be discussed. This patch is perfectly fine.

Ouch, my apology to Peter didn't make it ... Agreed, I should have split
the comment under patches
 [06/19] (part quoted above; does not depend on PV),
 [16/19] (part quoted below) and
 [17/19] (general doubts).

> >Another minor design flaw is that formerly first VCPU gets appended to
> >the tail when it decides to queue;
> >is the performance gain worth it?
> >
> >Thanks.
> 
> Yes, the performance gain is worth it. The primary goal is to be not worse
> than ticket spinlock in light load situation which is the most common case.
> This feature is need to achieve that.

Ok.
I've seen merit in pvqspinlock even with slightly slower first-waiter,
so I would have happily sacrificed those horrible branches.
(I prefer elegant to optimized code, but I can see why we want to be
 strictly better than ticketlock.)
Peter mentioned that we are focusing on bare-metal patches, so I'll
withold my other paravirt rants until they are polished.

And to forcefully bring this thread a little bit on-topic:

Pending-bit is effectively a lock in a lock, so I was wondering why
don't we use more pending bits; advantages are the same, just diminished
by the probability of having an ideally contended lock:
 - waiter won't be blocked on RAM access if critical section (or more)
   ends sooner
 - some unlucky cacheline is not forgotten
 - faster unlock (no need for tail operations)
(- ?)
disadvantages are magnified:
 - increased complexity
 - intense cacheline sharing
   (I thought that this is the main disadvantage of ticketlock.)
(- ?)

One bit still improved performance, is it the best we got?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/