Message-ID: <55F8394A.1010809@hpe.com>
Date:	Tue, 15 Sep 2015 11:29:14 -0400
From:	Waiman Long <waiman.long@....com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org, Scott J Norton <scott.norton@...com>,
	Douglas Hatch <doug.hatch@...com>,
	Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

On 09/15/2015 04:24 AM, Peter Zijlstra wrote:
> On Mon, Sep 14, 2015 at 03:15:20PM -0400, Waiman Long wrote:
>> On 09/14/2015 10:00 AM, Peter Zijlstra wrote:
>>> On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:
>>>> This patch allows one attempt for the lock waiter to steal the lock
>                        ^^^
>
>>>> when entering the PV slowpath.  This helps to reduce the performance
>>>> penalty caused by lock waiter preemption while not having much of
>>>> the downsides of a real unfair lock.
>>>> @@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>>>>
>>>>   	for (;; waitcnt++) {
>>>>   		for (loop = SPIN_THRESHOLD; loop; loop--) {
>>>> -			if (!READ_ONCE(l->locked))
>>>> -				return;
>>>> +			/*
>>>> +			 * Try to acquire the lock when it is free.
>>>> +			 */
>>>> +			if (!READ_ONCE(l->locked) &&
>>>> +			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
>>>> +				goto gotlock;
>>>>   			cpu_relax();
>>>>   		}
>>>>
>>> This isn't _once_, this is once per 'wakeup'. And note that interrupts
>>> unrelated to the kick can equally wake the vCPU up.
>>> void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>>> {
>>>      :
>>>          /*
>>>          * We touched a (possibly) cold cacheline in the per-cpu queue node;
>>>          * attempt the trylock once more in the hope someone let go while we
>>>          * weren't watching.
>>>          */
>>>         if (queued_spin_trylock(lock))
>>>                 goto release;
>> This is the only place where I consider lock stealing happens. Again, I
>> should have a comment in pv_queued_spin_trylock_unfair() to say where it
>> will be called.
> But you're not adding that..
>
> What you did add is a steal in pv_wait_head(), and its not even once per
> pv_wait_head, its inside the spin loop (I read it wrong yesterday).
>
> So that makes the entire Changelog complete crap. There isn't _one_
> attempt, and there is absolutely no fairness left.

Only the queue head vCPU will be in pv_wait_head() spinning to acquire
the lock. The other vCPUs in the queue will still be spinning on their
MCS nodes. The only competitors for the lock are those vCPUs that have
just entered the slowpath and call queued_spin_trylock() once before
being queued. That is what I mean by each task having only one chance
of stealing the lock. Maybe the following code changes can make this
point clearer.

Cheers,
Longman

--------------------------------------------------------------------------------------------------------

--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -59,7 +59,8 @@ struct pv_node {
 /*
  * Allow one unfair trylock when entering the PV slowpath to reduce the
  * performance impact of lock waiter preemption (either explicitly via
- * pv_wait or implicitly via PLE).
+ * pv_wait or implicitly via PLE). This function will be called once when
+ * a lock waiter enters the slowpath before being queued.
  *
  * A little bit of unfairness here can improve performance without many
  * of the downsides of a real unfair lock.
@@ -72,8 +73,8 @@ static inline bool pv_queued_spin_trylock_unfair(struct qspinlock *lock)
 	if (READ_ONCE(l->locked))
 		return 0;
 	/*
-	 * Wait a bit here to ensure that an actively spinning vCPU has a fair
-	 * chance of getting the lock.
+	 * Wait a bit here to ensure that an actively spinning queue head vCPU
+	 * has a fair chance of getting the lock.
 	 */
 	cpu_relax();

@@ -504,14 +505,23 @@ static int pv_wait_head_and_lock(struct qspinlock *lock,
 		 */
 		WRITE_ONCE(pn->state, vcpu_running);

-		for (loop = SPIN_THRESHOLD; loop; loop--) {
+		loop = SPIN_THRESHOLD;
+		while (loop) {
 			/*
-			 * Try to acquire the lock when it is free.
+			 * Spin until the lock is free
 			 */
-			if (!READ_ONCE(l->locked) &&
-			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
+			for (; loop && READ_ONCE(l->locked); loop--)
+				cpu_relax();
+			/*
+			 * Seeing the lock is free, this queue head vCPU is
+			 * the rightful next owner of the lock. However, the
+			 * lock may have just been stolen by another task which
+			 * has entered the slowpath. So we need to use atomic
+			 * operation to make sure that we really get the lock.
+			 * Otherwise, we have to wait again.
+			 */
+			if (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)
 				goto gotlock;
-			cpu_relax();
 		}

 		if (!lp) { /* ONCE */
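
For reference, here is a minimal user-space sketch of the same policy: one
unfair trylock on slowpath entry, then strictly queued waiting, with the
queue head claiming the lock via a compare-and-swap because a newly arriving
stealer can race it. It uses C11 atomics and a simplified ticket queue in
place of the kernel's MCS/PV machinery, and sched_yield() in place of
cpu_relax()/pv_wait(); all of the names (sketch_lock, try_steal_once,
wait_as_queue_head) are illustrative and are not part of the kernel code.

#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_lock {
	atomic_int	locked;		/* 0 = free, 1 = held */
	atomic_uint	queue_tail;	/* next ticket to hand out */
	atomic_uint	queue_head;	/* ticket currently at the queue head */
};

/*
 * The single unfair trylock a waiter is allowed when it enters the
 * slowpath, before it joins the queue.  This is the only point where a
 * task can "steal" the lock ahead of the queued waiters.
 */
static bool try_steal_once(struct sketch_lock *lock)
{
	int expected = 0;

	if (atomic_load_explicit(&lock->locked, memory_order_relaxed))
		return false;
	/* Give an actively spinning queue head a fair chance first. */
	sched_yield();
	return atomic_compare_exchange_strong_explicit(&lock->locked,
			&expected, 1, memory_order_acquire,
			memory_order_relaxed);
}

/*
 * Only the waiter that has become queue head spins here, like the queue
 * head vCPU in pv_wait_head_and_lock().  Seeing locked == 0 is not
 * enough: a task that has just entered the slowpath may win the race via
 * try_steal_once(), so the head must claim the lock with a CAS and, if
 * it loses, go back to waiting.
 */
static void wait_as_queue_head(struct sketch_lock *lock)
{
	for (;;) {
		int expected = 0;

		while (atomic_load_explicit(&lock->locked,
					    memory_order_relaxed))
			sched_yield();	/* stand-in for cpu_relax()/pv_wait() */

		if (atomic_compare_exchange_strong_explicit(&lock->locked,
				&expected, 1, memory_order_acquire,
				memory_order_relaxed))
			return;		/* the lock is really ours now */
		/* Lost the race to a stealer; wait for the lock again. */
	}
}

static void sketch_lock_acquire(struct sketch_lock *lock)
{
	unsigned int ticket;

	if (try_steal_once(lock))	/* exactly one steal attempt */
		return;

	/* Queue up and wait until we become the queue head. */
	ticket = atomic_fetch_add_explicit(&lock->queue_tail, 1,
					   memory_order_relaxed);
	while (atomic_load_explicit(&lock->queue_head,
				    memory_order_acquire) != ticket)
		sched_yield();

	wait_as_queue_head(lock);

	/* We hold the lock; promote the next queued waiter to queue head. */
	atomic_fetch_add_explicit(&lock->queue_head, 1, memory_order_release);
}

static void sketch_lock_release(struct sketch_lock *lock)
{
	atomic_store_explicit(&lock->locked, 0, memory_order_release);
}

As in the patch above, a waiter calls try_steal_once() exactly once per
acquisition attempt, and a queue head that loses the cmpxchg simply goes
back to waiting, so queue order is preserved among everyone who failed the
initial steal.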

