Message-ID: <574DD45C.6000601@hpe.com>
Date:	Tue, 31 May 2016 14:13:48 -0400
From:	Waiman Long <waiman.long@....com>
To:	xinhui <xinhui.pan@...ux.vnet.ibm.com>
CC:	<linux-kernel@...r.kernel.org>, <peterz@...radead.org>,
	<mingo@...hat.com>
Subject: Re: [PATCH] pv-qspinlock: Try to re-hash the lock after spurious_wakeup

On 05/30/2016 04:53 AM, xinhui wrote:
>
>
> On 2016-05-28 11:41, Waiman Long wrote:
>> On 05/27/2016 06:32 AM, xinhui wrote:
>>>
>>> On 2016-05-27 02:31, Waiman Long wrote:
>>>> On 05/25/2016 02:09 AM, Pan Xinhui wrote:
>>>>> In pv_wait_head_or_lock, if there is a spurious_wakeup and it fails to
>>>>> get the lock because of lock stealing, then after a short spin we need
>>>>> to hash the lock again and enter pv_wait to yield.
>>>>>
>>>>> Currently after a spurious_wakeup, as l->locked is not _Q_SLOW_VAL,
>>>>> pv_wait might do nothing and return directly, that is not
>>>>> paravirt-friendly because pv_wait_head_or_lock will just spin on the
>>>>> lock then.
>>>>>
>>>>> Signed-off-by: Pan Xinhui<xinhui.pan@...ux.vnet.ibm.com>
>>>>> ---
>>>>>   kernel/locking/qspinlock_paravirt.h | 39 +++++++++++++++++++++++++++++--------
>>>>>   1 file changed, 31 insertions(+), 8 deletions(-)
>>>>
>>>> Is this a problem you can easily reproduce on PPC? I have not 
>>>> observed this issue when testing on x86.
>>>>
>>> Hi, Waiman
>>>     I noticed that the spurious_wakeup count is very high when I run
>>> benchmark tests and stress tests. After a simple investigation,
>>> I found that pv_wait_head_or_lock() just keeps looping.
>>>
>>
>> That shouldn't happen in the normal case. When testing on x86, I
>> typically get the following stat data for an over-committed guest:
>>
>> pv_lock_slowpath=9256211
>> pv_lock_stealing=36398363
>> pv_spurious_wakeup=311
>> pv_wait_again=294
>> pv_wait_early=3255605
>> pv_wait_head=173
>> pv_wait_node=3256280
>>
> OK, here is the result after running the command
> perf bench sched messaging -g 512:
>
> pv_lock_slowpath=2331407
> pv_lock_stealing=192038
> pv_spurious_wakeup=236319
> pv_wait_again=215668
> pv_wait_early=177299
> pv_wait_head=9206
> pv_wait_node=228781
>
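
To put the two stat dumps above on a common scale, one option is to
normalise pv_spurious_wakeup by pv_lock_slowpath. This is only a rough
sketch built from the counters quoted in this thread; the helper name
spurious_ratio is made up for illustration and is not part of the kernel.

```python
# Counters copied from the two stat dumps quoted above.
x86 = {"pv_lock_slowpath": 9256211, "pv_spurious_wakeup": 311}
ppc = {"pv_lock_slowpath": 2331407, "pv_spurious_wakeup": 236319}


def spurious_ratio(stats):
    """Spurious wakeups per slowpath entry (hypothetical helper)."""
    return stats["pv_spurious_wakeup"] / stats["pv_lock_slowpath"]


x86_ratio = spurious_ratio(x86)
ppc_ratio = spurious_ratio(ppc)

print(f"x86: {x86_ratio:.4%} spurious wakeups per slowpath entry")
print(f"ppc: {ppc_ratio:.2%} spurious wakeups per slowpath entry")
```

With these numbers the x86 guest sees a spurious wakeup on roughly 0.003%
of slowpath entries, while the PPC guest sees one on roughly 10% of them.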

Is the high spurious wakeup count caused by the way PPC schedules processor
resources to vCPUs? On x86, once the vCPU voluntarily sleeps, it won't
get woken up until there is an explicit vCPU kick request. That may not be
the case for PPC, which may explain the high spurious wakeup number.

>> The queue head doesn't call pv_wait that often. There are a few
>> spurious wakeups, but they are mostly caused by lock stealing. How long
>> does a cpu_relax() take on PPC?
>>
> 946012160 cpu_relax loops within 10 seconds. So if SPIN_THRESHOLD is
> 1<<15, it costs about 0.3ms to spin on the lock. How about x86?
>
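
The PPC estimate above can be reproduced with a few lines of arithmetic.
This is a sketch that assumes the measured loop count is a fair proxy for
the cost of one iteration of the spin loop, which the thread does not
confirm.

```python
SPIN_THRESHOLD = 1 << 15      # value discussed in this thread

loops = 946012160             # cpu_relax() loops measured on PPC
seconds = 10.0                # duration of that measurement

# Average cost of one cpu_relax() loop, in nanoseconds.
ns_per_loop = seconds / loops * 1e9

# Time spent spinning SPIN_THRESHOLD times before giving up, in milliseconds.
full_spin_ms = ns_per_loop * SPIN_THRESHOLD / 1e6

print(f"{ns_per_loop:.2f} ns per loop, {full_spin_ms:.3f} ms per full spin")
```

That gives roughly 10.6 ns per loop and about 0.35 ms for a full spin,
consistent with the 0.3ms figure quoted above.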

For x86, one measurement that I got in the past is that each cpu_relax() 
loop took about 3ns. So the full spin will take about 0.9ms.

> And only 10134976 pv_wait/pv_kick hypercall loops within 10 seconds,
> so every hypercall itself (the so-called latency) costs less than 1us.
>

The hypercall is much slower on x86. It is about 10-20 us for pv_kick
and up to 100us for pv_kick=>pv_wait.
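
One way to relate these figures is to express a hypercall in units of
cpu_relax() iterations on each architecture. The sketch below uses only
the numbers quoted in this thread; it assumes the PPC hypercall loop and
the x86 pv_kick=>pv_wait path are comparable, which may not hold, and the
helper name spins_per_hypercall is hypothetical.

```python
def spins_per_hypercall(hypercall_ns, relax_ns):
    """How many cpu_relax() loops fit in one hypercall (hypothetical helper)."""
    return hypercall_ns / relax_ns


# PPC: both figures derived from the 10-second loop counts quoted above.
ppc_relax_ns = 10.0 / 946012160 * 1e9
ppc_hcall_ns = 10.0 / 10134976 * 1e9

# x86: about 3 ns per cpu_relax() loop, up to 100 us for pv_kick=>pv_wait.
x86_relax_ns = 3.0
x86_hcall_ns = 100_000.0

ppc_spins = spins_per_hypercall(ppc_hcall_ns, ppc_relax_ns)
x86_spins = spins_per_hypercall(x86_hcall_ns, x86_relax_ns)

print(f"ppc: {ppc_spins:.0f} loops per hypercall")
print(f"x86: {x86_spins:.0f} loops per hypercall (1<<15 = {1 << 15})")
```

On these figures a slow x86 hypercall costs about as much as 1<<15
cpu_relax() loops, while a PPC hypercall loop costs fewer than a hundred.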

>>>     Here is my story: in my pv-qspinlock patchset v1 and v2, pv_wait on
>>> ppc ignored the first two parameters, *ptr and val, which made
>>> lock_stealing hit too often.
>>
>> The pvqspinlock code does depend on pv_wait() doing a final check to
>> see if the lock value has changed. The code may not work reliably
>> without that.
>>
> Agreed, so pv_wait now does the check of *ptr and val.
>
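
The final check discussed here can be illustrated with a toy model. This
is not the kernel code: the lock byte is modelled as a one-element list,
the hypervisor sleep is a caller-supplied callback, and the constant
values are assumed to mirror the kernel's _Q_LOCKED_VAL and _Q_SLOW_VAL.

```python
Q_LOCKED_VAL = 1   # assumed to match _Q_LOCKED_VAL
Q_SLOW_VAL = 3     # assumed to match _Q_SLOW_VAL


def pv_wait(ptr, val, block):
    """Toy pv_wait: sleep only if *ptr still equals val.

    ptr   -- one-element list standing in for the lock byte
    val   -- value the caller expects to see before sleeping
    block -- callback standing in for the hypervisor yield (hypothetical)
    Returns True if the vCPU slept, False if it returned immediately.
    """
    if ptr[0] != val:
        return False   # value changed under us, do not sleep
    block()
    return True


sleeps = []
locked = [Q_SLOW_VAL]

# Lock byte still holds _Q_SLOW_VAL: the waiter really yields.
first = pv_wait(locked, Q_SLOW_VAL, lambda: sleeps.append("slept"))

# A lock stealer took the lock, so the byte is now _Q_LOCKED_VAL and
# pv_wait returns at once, leaving the caller to spin.
locked[0] = Q_LOCKED_VAL
second = pv_wait(locked, Q_SLOW_VAL, lambda: sleeps.append("slept"))

print(first, second, len(sleeps))
```

The second call shows the situation described in the patch: once
l->locked is no longer _Q_SLOW_VAL, pv_wait does nothing and returns.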
>>> And when I changed SPIN_THRESHOLD to a small value, the system became
>>> very unstable, because the waiter enters pv_wait quickly and no one
>>> will kick the waiter's cpu if we enter pv_wait twice because of
>>> lock_stealing.
>>>     So what I did in my pv-qspinlock patchset v3 was to add
>>> if (*ptr == val) in pv_wait. However, as I mentioned above, the
>>> spurious_wakeup count is then too high, which also means our cpu
>>> slice is wasted.
>>
>> The SPIN_THRESHOLD should be sufficiently big. A small value will
>> cause too many waits and wake-ups, which may not be good. Anyway,
>> more testing and tuning may be needed to make the pvqspinlock code
>> work well with PPC.
>>
> Agreed, but I think the SPIN_THRESHOLD (1<<15) is a little large for ppc.
>
> I even came up with the idea of making SPIN_THRESHOLD an extern
> variable on ppc. But I am busy and I wonder if it's worth doing that.

The purpose of the SPIN_THRESHOLD is to make sure that the vCPU won't
call pv_wait() if the vCPUs in the guest aren't over-committed. The
situation may be a bit different on PPC, so you need to decide how
large the SPIN_THRESHOLD should be there.

Cheers,
Longman
