Message-ID: <50615EE4.1040809@linux.vnet.ibm.com>
Date: Tue, 25 Sep 2012 13:06:04 +0530
From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To: Avi Kivity <avi@...hat.com>
CC: Rik van Riel <riel@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Gleb Natapov <gleb@...hat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE
handler
On 09/24/2012 09:11 PM, Avi Kivity wrote:
> On 09/21/2012 08:24 PM, Raghavendra K T wrote:
>> On 09/21/2012 06:32 PM, Rik van Riel wrote:
>>> On 09/21/2012 08:00 AM, Raghavendra K T wrote:
>>>> From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
>>>>
>>>> When the total number of VCPUs in the system is less than or equal
>>>> to the number of physical CPUs, PLE exits become costly, since each
>>>> VCPU can have a dedicated PCPU and trying to find a target VCPU to
>>>> yield_to just burns time in the PLE handler.
>>>>
>>>> This patch reduces that overhead by simply returning in such
>>>> scenarios, after checking the length of the current CPU's runqueue.
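
(To make the mechanism concrete: the check amounts to the sketch
below. This is illustrative only; vcpu_rq_dedicated() is a placeholder
name, the actual patch simply tests the runqueue length.)

	void kvm_vcpu_on_spin(struct kvm_vcpu *me)
	{
		/*
		 * Undercommitted case: if this pcpu's runqueue holds
		 * only us, every vcpu likely has a dedicated pcpu, so
		 * yield_to() has no useful target. Return early and
		 * let the guest resume spinning.
		 */
		if (vcpu_rq_dedicated(me))	/* placeholder: rq length <= 1 */
			return;

		/* ... existing directed-yield candidate search ... */
	}
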
>>>
>>> I am not convinced this is the way to go.
>>>
>>> The VCPU that is holding the lock, and is not releasing it,
>>> probably got scheduled out. That implies that VCPU is on a
>>> runqueue with at least one other task.
>>
>> I see your point here, we have two cases:
>>
>> case 1)
>>
>> rq1 : vcpu1->wait(lockA) (spinning)
>> rq2 : vcpu2->holding(lockA) (running)
>>
>> Here, ideally vcpu1 should not enter the PLE handler, since it would
>> surely get the lock within the ple_window cycles (assuming ple_window
>> is tuned perfectly for that workload).
>>
>> Maybe this explains why we are not seeing a benefit with kernbench.
>>
>> On the other side, since we cannot have a ple_window tuned perfectly
>> for all types of workloads, we gain for those workloads that need
>> more than 4096 cycles. Is that what we are seeing in the benefited
>> cases?
>
> Maybe we need to increase the ple window regardless. 4096 cycles is 2
> microseconds or less (call it t_spin). The overhead from
> kvm_vcpu_on_spin() and the associated task switches is at least a few
> microseconds, increasing as contention is added (call it t_yield). The
> time for a natural context switch is several milliseconds (call it
> t_slice). There is also the time the lock holder owns the lock,
> assuming no contention (t_hold).
>
> If t_yield > t_spin, then in the undercommitted case it dominates
> t_spin. If t_hold > t_spin we lose badly.
>
> If t_spin > t_yield, then the undercommitted case doesn't suffer as much
> as most of the spinning happens in the guest instead of the host, so it
> can pick up the unlock timely. We don't lose too much in the
> overcommitted case provided the values aren't too far apart (say a
> factor of 3).
>
> Obviously t_spin must be significantly smaller than t_slice, otherwise
> it accomplishes nothing.
>
> Regarding t_hold: if it is small, then a larger t_spin helps avoid false
> exits. If it is large, then we're not very sensitive to t_spin. It
> doesn't matter if it takes us 2 usec or 20 usec to yield, if we end up
> yielding for several milliseconds.
>
> So I think it's worth trying again with ple_window of 20000-40000.
>
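Just to put numbers on the above (assuming a 2 GHz clock; the actual
rate is of course host dependent), a quick back-of-envelope check:

	#include <stdio.h>

	int main(void)
	{
		double ghz = 2.0;	/* assumed host clock, not measured */
		unsigned int window[] = { 4096, 20000, 40000 };

		for (int i = 0; i < 3; i++)
			printf("ple_window %5u => t_spin %5.1f usec\n",
			       window[i], window[i] / (ghz * 1000.0));
		/*
		 * 4096 => ~2 usec; 20000-40000 => 10-20 usec, still well
		 * under 1% of a multi-millisecond t_slice.
		 */
		return 0;
	}
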
Agreed that spinning is not costly, and I have tried increasing
ple_window earlier. I'll give it one more shot.
I was thinking that unnecessary spinning of vcpus (spinning when the
lock holder is preempted) adds up to significant degradation, and that
the ticketlock scenario is especially problematic. No?
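
To illustrate what I mean, a simplified userspace rendering of the
ticketlock spin (GCC atomic builtins, not the actual kernel code):

	struct ticketlock {
		unsigned short head;	/* ticket currently being served */
		unsigned short tail;	/* next ticket to hand out */
	};

	static void ticket_lock(struct ticketlock *lock)
	{
		unsigned short me =
			__atomic_fetch_add(&lock->tail, 1, __ATOMIC_RELAXED);

		/*
		 * Strict FIFO: no progress until *our* number comes up.
		 * If the lock holder (or any earlier waiter) is preempted,
		 * every later vcpu burns its full ple_window here, exits,
		 * and may well be yielded to a vcpu that cannot acquire
		 * the lock either.
		 */
		while (__atomic_load_n(&lock->head, __ATOMIC_ACQUIRE) != me)
			;	/* spin */
	}

	static void ticket_unlock(struct ticketlock *lock)
	{
		__atomic_fetch_add(&lock->head, 1, __ATOMIC_RELEASE);
	}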