Message-ID: <55A7C506.9030309@hp.com>
Date: Thu, 16 Jul 2015 10:51:50 -0400
From: Waiman Long <waiman.long@...com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Scott J Norton <scott.norton@...com>,
Douglas Hatch <doug.hatch@...com>,
Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH v2 4/6] locking/pvqspinlock: Allow vCPUs kick-ahead
On 07/16/2015 01:46 AM, Peter Zijlstra wrote:
> On Wed, Jul 15, 2015 at 10:01:02PM -0400, Waiman Long wrote:
>> On 07/15/2015 05:39 AM, Peter Zijlstra wrote:
>>> On Tue, Jul 14, 2015 at 10:13:35PM -0400, Waiman Long wrote:
>>>> Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthen the
>>>> critical section and block forward progress. This patch implements
>>>> a kick-ahead mechanism where the unlocker kicks the queue head
>>>> vCPU as well as up to four additional vCPUs next to the queue head
>>>> if they were halted. The kicks are done after exiting the critical
>>>> section to improve parallelism.
>>>>
>>>> The amount of kick-ahead allowed depends on the number of vCPUs
>>>> in the VM guest. This patch, by itself, won't do much as most of
>>>> the kicking is currently done at lock time. Coupled with the next
>>>> patch, which defers lock-time kicking to unlock time, it should improve
>>>> overall system performance in a busy overcommitted guest.
>>>>
>>>> Linux kernel builds were run in a KVM guest on an 8-socket, 4
>>>> cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
>>>> Haswell-EX system. Both systems were configured to have 32 physical
>>>> CPUs. The kernel build times before and after the patch were:
>>>>
>>>>                        Westmere                Haswell
>>>>   Patch           32 vCPUs   48 vCPUs    32 vCPUs   48 vCPUs
>>>>   -----           --------   --------    --------   --------
>>>>   Before patch     3m25.0s   10m34.1s     2m02.0s   15m35.9s
>>>>   After patch      3m27.4s   10m32.0s     2m00.8s   14m52.5s
>>>>
>>>> There wasn't too much difference before and after the patch.
>>> That means either the patch isn't worth it, or, as you seem to imply, it's
>>> in the wrong place in this series.
>> It needs to be coupled with the next patch to be effective, as most of the
>> kicking happens at the lock side instead of at the unlock side. If
>> you look at the sample pvqspinlock stats in patch 3:
>>
>> lock_kick_count=755354
>> unlock_kick_count=87
>>
>> The number of unlock kicks is negligible compared with the lock kicks. Patch
>> 5 does have a dependency on patch 4 unless we make it unconditionally defer
>> kicking to the unlock call, which was what I had done in the v1 patch. The
>> reason I changed this in v2 is that I found a very slight performance
>> degradation in doing so.
> This way we cannot see the gains of the proposed complexity. So put it
> in a place where you can.
OK, I will see what I can do to make the performance change more visible
on a patch-by-patch basis.
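
For reference, the unlock-side kick-ahead I have in mind is conceptually
along the lines of the sketch below. This is illustrative only, not the
actual patch code: the function name pv_kick_ahead_nodes() is made up for
the example, and it assumes a pv_node layout and pv_kick()/vcpu_halted
helpers similar to what qspinlock_paravirt.h already provides, with a
pv_kick_ahead count computed at init time.

/*
 * Sketch only: kick the queue head vCPU plus up to kick_ahead vCPUs
 * behind it, but only those that actually halted in pv_wait().  This
 * runs after the lock has been released, so the kicks no longer
 * lengthen the critical section.
 */
static void pv_kick_ahead_nodes(struct pv_node *head, int kick_ahead)
{
	struct pv_node *node = head;
	int i;

	for (i = 0; node && i <= kick_ahead; i++) {
		if (READ_ONCE(node->state) == vcpu_halted)
			pv_kick(node->cpu);
		node = (struct pv_node *)READ_ONCE(node->mcs.next);
	}
}

Doing the kicks only after the lock has been released is what lets them
proceed in parallel with the next lock holder's critical section.
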
>>> You also do not offer any support for any of the magic numbers...
>> I chose 4 for PV_KICK_AHEAD_MAX as I didn't see much performance difference
>> when I did a kick-ahead of 5. Also, it may be too unfair to the vCPU doing
>> the kicking if the number is too big. The other magic number is the
>> pv_kick_ahead value, which is kind of arbitrary. Right now I derive it from
>> a log2 of the number of vCPUs, but dividing by 4 (a right shift by 2) would
>> work as well.
> So what was the difference between 1, 2, 3 and 4? I would think one
> extra kick is the biggest help, no?
I was seeing diminishing returns with more kicks. I can add a table on
that in the next patch.
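
As for how pv_kick_ahead gets its value, the scheme described above amounts
to roughly the following sketch (illustrative only; PV_KICK_AHEAD_MAX,
pv_kick_ahead and the log2-versus-shift choice are the parts that come from
the discussion, while pv_init_kick_ahead() is a made-up name):

#define PV_KICK_AHEAD_MAX	4	/* cap on extra vCPUs kicked */

static int pv_kick_ahead __read_mostly;

/*
 * Scale the kick-ahead count with the guest size: log2 of the number
 * of possible vCPUs, capped at PV_KICK_AHEAD_MAX.  Dividing the vCPU
 * count by 4 (a right shift by 2) would be the alternative mentioned
 * above.
 */
static void __init pv_init_kick_ahead(void)
{
	pv_kick_ahead = min(ilog2(num_possible_cpus()), PV_KICK_AHEAD_MAX);
}
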
Cheers,
Longman
--