[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <e124456c-0691-460b-c471-00ac1d7b0254@huawei.com>
Date: Fri, 18 Nov 2022 10:29:40 +0800
From: "wangyanan (Y)" <wangyanan55@...wei.com>
To: David Matlack <dmatlack@...gle.com>
CC: Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>, kvm <kvm@...r.kernel.org>,
David Hildenbrand <david@...hat.com>,
"Kernel Mailing List, Linux" <linux-kernel@...r.kernel.org>,
Paul Mackerras <paulus@...abs.org>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
KVM ARM <kvmarm@...ts.cs.columbia.edu>,
Janosch Frank <frankja@...ux.ibm.com>,
Marc Zyngier <maz@...nel.org>, Joerg Roedel <joro@...tes.org>,
Huacai Chen <chenhuacai@...nel.org>,
Christian Borntraeger <borntraeger@...ibm.com>,
Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
Jon Cargille <jcargill@...gle.com>,
kvm-ppc <kvm-ppc@...r.kernel.org>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
Jim Mattson <jmattson@...gle.com>,
Cornelia Huck <cohuck@...hat.com>,
"open list:MIPS" <linux-mips@...r.kernel.org>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
<wangyuan38@...wei.com>
Subject: Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM:
Halt-polling fixes, cleanups and a new stat)
On 2022/11/17 1:19, David Matlack wrote:
> On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
>> Hi Sean, Paolo,
>>
>> I recently also notice the behavior change of param halt_poll_ns.
>> Now it loses the ability to:
>> 1) dynamically disable halt polling for all the running VMs
>> by `echo 0 > /sys`
>> 2) dynamically adjust the halt polling interval for all the
>> running VMs by `echo * > /sys`
>>
>> While in our cases, we usually use above two abilities, and
>> KVM_CAP_HALT_POLL is not used yet.
> I think the right path forward is to make KVM_CAP_HALT_POLL a pure
> override of halt_poll_ns, and restore the pre-existing behavior of
> halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
> below.
Agree with this.
kvm.halt_poll_ns serves like a legacy method to control halt polling
globally. Once KVM_CAP_HALT_POLL is used for a VM, it should
hold 100% responsibility to control on the VM, including disabling
the polling. This strategy helps to keep the two mechanisms
decoupled.
> That will fix issues (1) and (2) above for any VM not using
> KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
> all changes to halt_poll_ns. If we truly need a mechanism for admins to
> disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
> separate module parameter for that. But IMO, any setup that is
> sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
> KVM_CAP_HALT_POLL to disable halt polling.
>
> If everyone is happy with this approach I can test and send a real patch
> to the mailing list.
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index e6e66c5e56f2..253ad055b6ad 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -788,6 +788,7 @@ struct kvm {
> struct srcu_struct srcu;
> struct srcu_struct irq_srcu;
> pid_t userspace_pid;
> + bool override_halt_poll_ns;
> unsigned int max_halt_poll_ns;
> u32 dirty_ring_size;
> bool vm_bugged;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 43bbe4fde078..479d0d0da0b5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
> goto out_err_no_arch_destroy_vm;
> }
>
> - kvm->max_halt_poll_ns = halt_poll_ns;
> -
> r = kvm_arch_init_vm(kvm, type);
> if (r)
> goto out_err_no_arch_destroy_vm;
> @@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
> sigemptyset(¤t->real_blocked);
> }
>
> -static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
> {
> unsigned int old, val, grow, grow_start;
>
> @@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
> if (val < grow_start)
> val = grow_start;
>
> - if (val > vcpu->kvm->max_halt_poll_ns)
> - val = vcpu->kvm->max_halt_poll_ns;
> + if (val > max)
> + val = max;
>
> vcpu->halt_poll_ns = val;
> out:
> @@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> {
> bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
> bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
> + unsigned int max_halt_poll_ns;
> ktime_t start, cur, poll_end;
> + struct kvm *kvm = vcpu->kvm;
> bool waited = false;
> u64 halt_ns;
>
> + if (kvm->override_halt_poll_ns)
> + max_halt_poll_ns = kvm->max_halt_poll_ns;
> + else
> + max_halt_poll_ns = READ_ONCE(halt_poll_ns);
> +
> start = cur = poll_end = ktime_get();
> if (do_halt_poll) {
> ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
> @@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> if (halt_poll_allowed) {
> if (!vcpu_valid_wakeup(vcpu)) {
> shrink_halt_poll_ns(vcpu);
> - } else if (vcpu->kvm->max_halt_poll_ns) {
> + } else if (max_halt_poll_ns) {
> if (halt_ns <= vcpu->halt_poll_ns)
> ;
> /* we had a long block, shrink polling */
> - else if (vcpu->halt_poll_ns &&
> - halt_ns > vcpu->kvm->max_halt_poll_ns)
> + else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
> shrink_halt_poll_ns(vcpu);
> /* we had a short halt and our poll time is too small */
> - else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> - halt_ns < vcpu->kvm->max_halt_poll_ns)
> - grow_halt_poll_ns(vcpu);
> + else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> + halt_ns < max_halt_poll_ns)
> + grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> } else {
> vcpu->halt_poll_ns = 0;
> }
> @@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
> if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
> return -EINVAL;
>
> + kvm->override_halt_poll_ns = true;
> kvm->max_halt_poll_ns = cap->args[0];
> return 0;
> }
Looks sensible to me overall.
I will look at the RFC series, thanks for your quick response.
Yanan
.
>> On 2021/9/28 1:33, Sean Christopherson wrote:
>>> On Mon, Sep 27, 2021, Paolo Bonzini wrote:
>>>> On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
>>>> <borntraeger@...ibm.com> wrote:
>>>>>> So I think there are two possibilities that makes sense:
>>>>>>
>>>>>> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>>>>> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
>>>> Yes, that's what I meant. David pointed out that doesn't allow you to
>>>> disable halt polling altogether, but for that you can always ask each
>>>> VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
>>>> don't know about Google's usecase, but mine was actually more about
>>>> using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
>>> I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
>>> in-the-field mitigation if halt-polling is broken. It'd be trivial to support, e.g.
>> Do we have any plan to repost the diff as a fix?
>> I would be very nice that this issue can be solved.
>>
>> Besides, I think we may need some Doc for users to describe
>> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
>> "Documentation/virt/guest-halt-polling.rst".
>>> @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>>> update_halt_poll_stats(vcpu, start, poll_end, !waited);
>>>
>>> if (halt_poll_allowed) {
>>> + max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
>>> + if (!max_halt_poll_ns || !halt_poll_ns) <------ squish the max if halt_poll_ns==0
>>> + max_halt_poll_ns = halt_poll_ns;
>>> +
>> Does this mean that KVM_CAP_HALT_POLL will not be able to
>> disable halt polling for a VM individually when halt_poll_ns !=0?
>>> if (!vcpu_valid_wakeup(vcpu)) {
>>> shrink_halt_poll_ns(vcpu);
>>> - } else if (vcpu->kvm->max_halt_poll_ns) {
>>> + } else if (max_halt_poll_ns) {
>>> if (halt_ns <= vcpu->halt_poll_ns)
>>> ;
>>> /* we had a long block, shrink polling */
>>> else if (vcpu->halt_poll_ns &&
>>> - halt_ns > vcpu->kvm->max_halt_poll_ns)
>>> + halt_ns > max_halt_poll_ns)
>>> shrink_halt_poll_ns(vcpu);
>>> /* we had a short halt and our poll time is too small */
>>> - else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
>>> - halt_ns < vcpu->kvm->max_halt_poll_ns)
>>> - grow_halt_poll_ns(vcpu);
>>> + else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
>>> + halt_ns < max_halt_poll_ns)
>>> + grow_halt_poll_ns(vcpu, max_halt_poll_ns);
>>> } else {
>>> vcpu->halt_poll_ns = 0;
>>> }
>>> _______________________________________________
>>> kvmarm mailing list
>>> kvmarm@...ts.cs.columbia.edu
>>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>>> .
>> Thanks,
>> Yanan
> .
Powered by blists - more mailing lists