Message-ID: <500BF35D.6050605@linux.vnet.ibm.com>
Date: Sun, 22 Jul 2012 18:04:37 +0530
From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@...hat.com>, Avi Kivity <avi@...hat.com>,
Rik van Riel <riel@...hat.com>,
Christian Borntraeger <borntraeger@...ibm.com>
CC: "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
S390 <linux-s390@...r.kernel.org>,
Carsten Otte <cotte@...ibm.com>, KVM <kvm@...r.kernel.org>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>, X86 <x86@...nel.org>,
Gleb Natapov <gleb@...hat.com>, linux390@...ibm.com,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Joerg Roedel <joerg.roedel@....com>
Subject: Re: [PATCH RFC V5 0/3] kvm: Improving directed yield in PLE handler
On 07/20/2012 11:06 PM, Marcelo Tosatti wrote:
> On Wed, Jul 18, 2012 at 07:07:17PM +0530, Raghavendra K T wrote:
>>
>> Currently the Pause Loop Exit (PLE) handler does a directed yield to a
>> random vcpu on a pause-loop exit. We already filter candidates when
>> choosing the vcpu to yield_to. This change adds more checks when choosing
>> a candidate to yield_to.
>>
>> On guests with a large number of vcpus, there is a high probability of
>> yielding to a vcpu that has itself recently done a pause-loop exit.
>> Such a yield can lead to that vcpu spinning again.
>>
>> The patchset keeps track of pause-loop exits and gives a chance to a
>> vcpu which has:
>>
>> (a) Not done a pause-loop exit at all (probably a preempted lock holder)
>>
>> (b) Been skipped in the last iteration because it did a pause-loop exit,
>> and has probably become eligible now (next eligible lock holder)
>>
>> This concept also helps in cpu relax interception cases, which use the
>> same handler.
>>
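
For readers following the thread, rules (a) and (b) above can be modeled in a
few lines of user-space Python. This is only a sketch, not the kernel code:
the names spin_loop, in_spin_loop, dy_eligible and
vcpu_eligible_for_directed_yield come from the changelog, but the toggle used
to implement rule (b) is my assumption about how "skipped last time, eligible
now" is tracked.

```python
from dataclasses import dataclass


@dataclass
class SpinLoop:
    """Per-vcpu bookkeeping; field names follow the V5 changelog."""
    in_spin_loop: bool = False  # vcpu is currently in a PLE / cpu-relax intercept
    dy_eligible: bool = False   # starts False so a spinner is skipped the first time


def vcpu_eligible_for_directed_yield(state: SpinLoop) -> bool:
    """Return True if this vcpu is a good directed-yield target.

    Rule (a): a vcpu that is not spinning is always eligible.
    Rule (b): a spinning vcpu is eligible only if it was skipped last time.
    """
    eligible = (not state.in_spin_loop) or state.dy_eligible
    if state.in_spin_loop:
        # Assumed bookkeeping: flip the flag so that a vcpu skipped in this
        # iteration gets its chance in the next one, and vice versa.
        state.dy_eligible = not state.dy_eligible
    return eligible
```

With this model, a vcpu that never did a pause-loop exit is always a
candidate, while a spinning vcpu is skipped on one scan and considered on the
next.
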
>> Changes since V4:
>> - Naming Change (Avi):
>> struct ple ==> struct spin_loop
>> cpu_relax_intercepted ==> in_spin_loop
>> vcpu_check_and_update_eligible ==> vcpu_eligible_for_directed_yield
>> - mark vcpu in spinloop as not eligible to avoid influence of previous exit
>>
>> Changes since V3:
>> - arch specific fix/changes (Christian)
>>
>> Changes since v2:
>> - Move ple structure to common code (Avi)
>> - rename pause_loop_exited to cpu_relax_intercepted (Avi)
>> - add config HAVE_KVM_CPU_RELAX_INTERCEPT (Avi)
>> - Drop superfluous curly braces (Ingo)
>>
>> Changes since v1:
>> - Add more documentation for structure and algorithm, and rename
>> plo ==> ple (Rik).
>> - change dy_eligible initial value to false (otherwise the very first
>> directed yield will not be skipped). (Nikunj)
>> - fixup signoff/from issue
>>
>> Future enhancements:
>> (1) Currently we have a boolean to decide on the eligibility of a vcpu. It
>> would be nice to get feedback on large guests (>32 vcpus) on whether an
>> integer counter would do better (with counter = say f(log n)).
>>
>> (2) We have not considered system load during iteration over the vcpus. With
>> that information we can limit the scan and also decide whether schedule()
>> is better. [I am able to use #kicked vcpus to decide on this, but maybe
>> there are better ideas, like information from the global loadavg.]
>>
>> (3) We can exploit this further with the PV patches, since PV also knows
>> about the next eligible lock holder.
>>
>> Summary: There is a very good improvement for KVM guests on a PLE machine.
>> V5 shows a large improvement for kernbench.
>>
>> +-----------+-----------+-----------+------------+-----------+
>>        base_rik      stdev    patched      stdev     %improve
>> +-----------+-----------+-----------+------------+-----------+
>>        kernbench (time in sec, lower is better)
>> +-----------+-----------+-----------+------------+-----------+
>> 1x      49.2300     1.0171    22.6842     0.3073   117.0233 %
>> 2x      91.9358     1.7768    53.9608     1.0154   70.37516 %
>> +-----------+-----------+-----------+------------+-----------+
>>
>> +-----------+-----------+-----------+------------+-----------+
>>        ebizzy (records/sec, higher is better)
>> +-----------+-----------+-----------+------------+-----------+
>> 1x    1129.2500    28.6793  2125.6250    32.8239   88.23334 %
>> 2x    1892.3750    75.1112  2377.1250   181.6822   25.61596 %
>> +-----------+-----------+-----------+------------+-----------+
>>
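
A note on reading the %improve column: the cover letter does not state the
formula, but the numbers are consistent with measuring the gain relative to
the patched time for kernbench (lower is better) and relative to the base
rate for ebizzy (higher is better). The helper below is a hypothetical sketch
of that inferred calculation, not something taken from the patches.

```python
def improvement_pct(base: float, patched: float, lower_is_better: bool) -> float:
    """Percent improvement, using the formula inferred from the table.

    lower_is_better=True  (e.g. kernbench seconds): (base - patched) / patched
    lower_is_better=False (e.g. ebizzy records/s):  (patched - base) / base
    """
    if lower_is_better:
        return (base - patched) / patched * 100.0
    return (patched - base) / base * 100.0


# Values copied from the 1x rows of the table above.
kernbench_1x = improvement_pct(49.2300, 22.6842, lower_is_better=True)
ebizzy_1x = improvement_pct(1129.2500, 2125.6250, lower_is_better=False)
```

Under this reading, a kernbench figure above 100% means the patched run took
less than half the base time.
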
>> Note: The patches are tested on x86.
>>
>> Links
>> V4: https://lkml.org/lkml/2012/7/16/80
>> V3: https://lkml.org/lkml/2012/7/12/437
>> V2: https://lkml.org/lkml/2012/7/10/392
>> V1: https://lkml.org/lkml/2012/7/9/32
>>
>> Raghavendra K T (3):
>> config: Add config to support ple or cpu relax optimzation
>> kvm : Note down when cpu relax intercepted or pause loop exited
>> kvm : Choose a better candidate for directed yield
>> ---
>> arch/s390/kvm/Kconfig | 1 +
>> arch/x86/kvm/Kconfig | 1 +
>> include/linux/kvm_host.h | 39 +++++++++++++++++++++++++++++++++++++++
>> virt/kvm/Kconfig | 3 +++
>> virt/kvm/kvm_main.c | 41 +++++++++++++++++++++++++++++++++++++++++
>> 5 files changed, 85 insertions(+), 0 deletions(-)
>
> Reviewed-by: Marcelo Tosatti <mtosatti@...hat.com>
>
Thanks Marcelo for the review. Avi, Rik, Christian, please let me know
if this series looks good now.