linux-kernel - Re: [PATCH v3] locking/pvqspinlock: restore/set vcpu_hashed state after failing adaptive locking spinning

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <57891301.8040809@hpe.com>
Date:	Fri, 15 Jul 2016 12:44:49 -0400
From:	Waiman Long <waiman.long@....com>
To:	Wanpeng Li <kernellwp@...il.com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Wanpeng Li <wanpeng.li@...mail.com>,
	Ingo Molnar <mingo@...nel.org>,
	Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH v3] locking/pvqspinlock: restore/set vcpu_hashed state
 after failing adaptive locking spinning

On 07/15/2016 03:45 AM, Wanpeng Li wrote:
> 2016-07-15 15:09 GMT+08:00 Peter Zijlstra<peterz@...radead.org>:
>> On Fri, Jul 15, 2016 at 05:26:40AM +0800, Wanpeng Li wrote:
>>> 2016-07-14 22:52 GMT+08:00 Waiman Long<waiman.long@....com>:
>>> [...]
>>>> As pv_kick_node() is called immediately after designating the next node as
>>>> the queue head, the chance of this racing is possible, but is not likely
>>>> unless the lock holder vCPU gets preempted for a long time at that right
>>>> moment. This change does not do any harm though, so I am OK with that.
>>>> However, I do want you to add a comment about the possible race in the code
>>>> as it isn't that obvious or likely.
>>> How about something like:
>>>
>>> /*
>>>   * If the lock holder vCPU gets preempted for a long time, pv_kick_node will
>>>   * advance its state and hash the lock, restore/set the vcpu_hashed state to
>>>   * avoid the race.
>>>   */
>> So I'm not sure. Yes it was a bug, but its fairly 'obvious' it should be
> I believe Waiman can give a better comments. :)

Yes, setting the state to vcpu_hashed is the more obvious choice. What I 
said is not obvious is that there can be a race between the new lock 
holder in pv_kick_node() and the new queue head trying to call 
pv_wait(). And it is what I want to document it. Maybe something more 
graphical can help:

/*
  * lock holder vCPU             queue head vCPU
  * ----------------             ---------------
  * node->locked = 1;
  * <preemption>                 READ_ONCE(node->locked)
  *    ...                       pv_wait_head_or_lock():
  *                                SPIN_THRESHOLD loop;
  *                                pv_hash();
  *                                lock->locked = _Q_SLOW_VAL;
  *                                node->state  = vcpu_hashed;
  * pv_kick_node():
  *   cmpxchg(node->state,
  *      vcpu_halted, vcpu_hashed);
  *   lock->locked = _Q_SLOW_VAL;
  *   pv_hash();
  *
  * With preemption at the right moment, it is possible that both the
  * lock holder and queue head vCPUs can be racing to set node->state.
  * Making sure the state is never set to vcpu_halted will prevent this
  * racing from happening.
  */

Cheers,
Longman