linux-kernel - Re: [PATCH v3] locking/pvqspinlock: restore/set vcpu_hashed state after failing adaptive locking spinning

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CANRm+Cx2OnvU8Dq7Vfz5v_MgGktOqSQB71eRQJqb_rUjTEWp9g@mail.gmail.com>
Date:	Sat, 16 Jul 2016 09:12:23 +0800
From:	Wanpeng Li <kernellwp@...il.com>
To:	Waiman Long <waiman.long@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Wanpeng Li <wanpeng.li@...mail.com>,
	Ingo Molnar <mingo@...nel.org>,
	Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH v3] locking/pvqspinlock: restore/set vcpu_hashed state
 after failing adaptive locking spinning

2016-07-16 0:44 GMT+08:00 Waiman Long <waiman.long@....com>:
> On 07/15/2016 03:45 AM, Wanpeng Li wrote:
>>
>> 2016-07-15 15:09 GMT+08:00 Peter Zijlstra<peterz@...radead.org>:
>>>
>>> On Fri, Jul 15, 2016 at 05:26:40AM +0800, Wanpeng Li wrote:
>>>>
>>>> 2016-07-14 22:52 GMT+08:00 Waiman Long<waiman.long@....com>:
>>>> [...]
>>>>>
>>>>> As pv_kick_node() is called immediately after designating the next node
>>>>> as
>>>>> the queue head, the chance of this racing is possible, but is not
>>>>> likely
>>>>> unless the lock holder vCPU gets preempted for a long time at that
>>>>> right
>>>>> moment. This change does not do any harm though, so I am OK with that.
>>>>> However, I do want you to add a comment about the possible race in the
>>>>> code
>>>>> as it isn't that obvious or likely.
>>>>
>>>> How about something like:
>>>>
>>>> /*
>>>>   * If the lock holder vCPU gets preempted for a long time, pv_kick_node
>>>> will
>>>>   * advance its state and hash the lock, restore/set the vcpu_hashed
>>>> state to
>>>>   * avoid the race.
>>>>   */
>>>
>>> So I'm not sure. Yes it was a bug, but its fairly 'obvious' it should be
>>
>> I believe Waiman can give a better comments. :)
>
>
> Yes, setting the state to vcpu_hashed is the more obvious choice. What I
> said is not obvious is that there can be a race between the new lock holder
> in pv_kick_node() and the new queue head trying to call pv_wait(). And it is
> what I want to document it. Maybe something more graphical can help:
>
> /*
>  * lock holder vCPU             queue head vCPU
>  * ----------------             ---------------
>  * node->locked = 1;
>  * <preemption>                 READ_ONCE(node->locked)
>  *    ...                       pv_wait_head_or_lock():
>  *                                SPIN_THRESHOLD loop;
>  *                                pv_hash();
>  *                                lock->locked = _Q_SLOW_VAL;
>  *                                node->state  = vcpu_hashed;
>  * pv_kick_node():
>  *   cmpxchg(node->state,
>  *      vcpu_halted, vcpu_hashed);
>  *   lock->locked = _Q_SLOW_VAL;
>  *   pv_hash();
>  *
>  * With preemption at the right moment, it is possible that both the
>  * lock holder and queue head vCPUs can be racing to set node->state.
>  * Making sure the state is never set to vcpu_halted will prevent this
>  * racing from happening.
>  */

Thanks, I will fold this in my patch. :)

Regards,
Wanpeng Li