Message-ID: <CALzav=eGOfniScPsR-_h=HOc6DZgZZRR7A5LCncNSTVQ5sdKhg@mail.gmail.com>
Date: Thu, 27 Aug 2015 09:25:51 -0700
From: David Matlack <dmatlack@...gle.com>
To: Wanpeng Li <wanpeng.li@...mail.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
kvm list <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/3] KVM: dynamic halt_poll_ns adjustment
On Thu, Aug 27, 2015 at 2:59 AM, Wanpeng Li <wanpeng.li@...mail.com> wrote:
> Hi David,
> On 8/26/15 1:19 AM, David Matlack wrote:
>>
>> Thanks for writing v2, Wanpeng.
>>
>> On Mon, Aug 24, 2015 at 11:35 PM, Wanpeng Li <wanpeng.li@...mail.com>
>> wrote:
>>>
>>> There is a downside to halt_poll_ns: polling still happens for idle
>>> VCPUs, which can waste CPU cycles. This patch adds the ability to adjust
>>> halt_poll_ns dynamically.
>>
>> What testing have you done with these patches? Do you know if this removes
>> the overhead of polling in idle VCPUs? Do we lose any of the performance
>> from always polling?
>>
>>> There are two new kernel parameters for changing the halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink. A third new parameter,
>>> halt_poll_ns_max, controls the maximal halt_poll_ns; it is internally
>>> rounded down to the closest multiple of halt_poll_ns_grow. The shrink/grow
>>> matrix is suggested by David:
>>>
>>> if (poll successfully for interrupt): stay the same
>>> else if (length of kvm_vcpu_block is longer than halt_poll_ns_max):
>>> shrink
>>> else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow
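
For reference, the policy above boils down to roughly the following C. This is
only a sketch of the decision, not code from the patch: block_ns would be the
time the vcpu actually spent blocked, and the function and variable names are
placeholders.

static void adjust_halt_poll_ns(struct kvm_vcpu *vcpu, bool waited, u64 block_ns)
{
	if (!waited)
		return;				/* poll caught the wakeup: no change */

	if (block_ns > halt_poll_ns_max)
		shrink_halt_poll_ns(vcpu);	/* halt was long, polling was wasted */
	else
		grow_halt_poll_ns(vcpu);	/* near miss, poll a bit longer next time */
}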
>>
>> The way you implemented this wasn't what I expected. I thought you would
>> time the whole function (kvm_vcpu_block). But I like your approach better.
>> It's simpler and [by inspection] does what we want.
>
>
> I see more overhead for idle vCPUs with this method than with always
> halt-polling, so I brought back growing vcpu->halt_poll_ns when an interrupt
> arrives and shrinking it when an idle vCPU is detected. The performance
> looks good in v4.
Why did this patch have worse idle overhead than always polling?
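
Just to check I'm reading the v4 approach correctly, the policy there would be
roughly the opposite gate (rough sketch; poll_caught_interrupt() is a name I
made up, and waited is the existing local in kvm_vcpu_block):

	if (poll_caught_interrupt(vcpu))
		grow_halt_poll_ns(vcpu);	/* wakeups arrive quickly, poll more */
	else if (waited)
		shrink_halt_poll_ns(vcpu);	/* vcpu was genuinely idle, poll less */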
>
> Regards,
> Wanpeng Li
>
>
>>
>>> halt_poll_ns_shrink/ |                      |
>>> halt_poll_ns_grow    | grow halt_poll_ns    | shrink halt_poll_ns
>>> ---------------------+----------------------+-----------------------
>>> < 1                  |  = halt_poll_ns      |  = 0
>>> < halt_poll_ns       | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
>>> otherwise            | += halt_poll_ns_grow | -= halt_poll_ns_shrink
>>
>> I was curious why you went with this approach rather than just the
>> middle row, or just the last row. Do you think we'll want the extra
>> flexibility?
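
To be concrete, "just the middle row" would reduce the helpers to roughly this
(untested sketch; halt_poll_ns_base is a made-up seed value so a multiplicative
grow has something to start from):

static unsigned int __grow_halt_poll_ns(unsigned int val)
{
	if (val == 0)
		val = halt_poll_ns_base;	/* seed, otherwise *= never leaves 0 */
	else
		val *= halt_poll_ns_grow;

	return min(val, halt_poll_ns_max);
}

static unsigned int __shrink_halt_poll_ns(unsigned int val)
{
	if (halt_poll_ns_shrink == 0)
		return 0;			/* shrink of 0 means stop polling */

	return val / halt_poll_ns_shrink;
}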
>>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
>>> ---
>>> virt/kvm/kvm_main.c | 65
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 64 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 93db833..2a4962b 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -66,9 +66,26 @@
>>> MODULE_AUTHOR("Qumranet");
>>> MODULE_LICENSE("GPL");
>>>
>>> -static unsigned int halt_poll_ns;
>>> +#define KVM_HALT_POLL_NS 500000
>>> +#define KVM_HALT_POLL_NS_GROW 2
>>> +#define KVM_HALT_POLL_NS_SHRINK 0
>>> +#define KVM_HALT_POLL_NS_MAX 2000000
>>
>> The macros are not necessary. Also, hard coding the numbers in the param
>> definitions will make reading the comments above them easier.
>>
>>> +
>>> +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
>>> module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
>>>
>>> +/* Default doubles per-vcpu halt_poll_ns. */
>>> +static unsigned int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
>>> +module_param(halt_poll_ns_grow, int, S_IRUGO);
>>> +
>>> +/* Default resets per-vcpu halt_poll_ns. */
>>> +static unsigned int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
>>> +module_param(halt_poll_ns_shrink, int, S_IRUGO);
>>> +
>>> +/* halt polling only reduces halt latency by 10-15 us, 2ms is enough */
>>
>> Ah, I misspoke before. I was thinking about round-trip latency. The
>> latency of a single halt is reduced by about 5-7 us.
>>
>>> +static unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
>>> +module_param(halt_poll_ns_max, int, S_IRUGO);
>>
>> We can remove halt_poll_ns_max. vcpu->halt_poll_ns can always start at
>> zero and grow from there. Then we just need one module param to keep
>> vcpu->halt_poll_ns from growing too large.
>>
>> [ It would make more sense to remove halt_poll_ns and keep
>> halt_poll_ns_max, but since halt_poll_ns already exists in upstream
>> kernels, we probably can't remove it. ]
>>
>>> +
>>> /*
>>> * Ordering of locks:
>>> *
>>> @@ -1907,6 +1924,48 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
>>> }
>>> EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
>>>
>>> +static unsigned int __grow_halt_poll_ns(unsigned int val)
>>> +{
>>> + if (halt_poll_ns_grow < 1)
>>> + return halt_poll_ns;
>>> +
>>> + val = min(val, halt_poll_ns_max);
>>> +
>>> + if (val == 0)
>>> + return halt_poll_ns;
>>> +
>>> + if (halt_poll_ns_grow < halt_poll_ns)
>>> + val *= halt_poll_ns_grow;
>>> + else
>>> + val += halt_poll_ns_grow;
>>> +
>>> + return val;
>>> +}
>>> +
>>> +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum)
>>
>> minimum never gets used.
>>
>>> +{
>>> + if (modifier < 1)
>>> + return 0;
>>> +
>>> + if (modifier < halt_poll_ns)
>>> + val /= modifier;
>>> + else
>>> + val -= modifier;
>>> +
>>> + return val;
>>> +}
>>> +
>>> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
>>
>> These wrappers aren't necessary.
>>
>>> +{
>>> + vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
>>> +}
>>> +
>>> +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
>>> +{
>>> + vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
>>> + halt_poll_ns_shrink, halt_poll_ns);
>>> +}
>>> +
>>> static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>>> {
>>> if (kvm_arch_vcpu_runnable(vcpu)) {
>>> @@ -1954,6 +2013,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>> break;
>>>
>>> waited = true;
>>> + if (vcpu->halt_poll_ns > halt_poll_ns_max)
>>> + shrink_halt_poll_ns(vcpu);
>>> + else
>>> + grow_halt_poll_ns(vcpu);
>>
>> Shouldn't this go after the loop, and before "out:", in case we schedule
>> more than once? You can gate it on "if (waited)" so it only runs if we
>> actually scheduled.
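
To be concrete, I was picturing roughly this shape at the bottom of
kvm_vcpu_block() (untested sketch, reusing the helpers from the patch):

		...
		schedule();
	}

	if (waited) {
		if (vcpu->halt_poll_ns > halt_poll_ns_max)
			shrink_halt_poll_ns(vcpu);
		else
			grow_halt_poll_ns(vcpu);
	}
out:
	...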
>>
>>> schedule();
>>> }
>>>
>>> --
>>> 1.9.1
>>>
>