linux-kernel - Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)

Open Source and information security mailing list archives

Message-ID: <Y3UboELxugwDJkIG@google.com>
Date:   Wed, 16 Nov 2022 09:19:28 -0800
From:   David Matlack <dmatlack@...gle.com>
To:     "wangyanan (Y)" <wangyanan55@...wei.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>, kvm <kvm@...r.kernel.org>,
        David Hildenbrand <david@...hat.com>,
        "Kernel Mailing List, Linux" <linux-kernel@...r.kernel.org>,
        Paul Mackerras <paulus@...abs.org>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        KVM ARM <kvmarm@...ts.cs.columbia.edu>,
        Janosch Frank <frankja@...ux.ibm.com>,
        Marc Zyngier <maz@...nel.org>, Joerg Roedel <joro@...tes.org>,
        Huacai Chen <chenhuacai@...nel.org>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
        Jon Cargille <jcargill@...gle.com>,
        kvm-ppc <kvm-ppc@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Jim Mattson <jmattson@...gle.com>,
        Cornelia Huck <cohuck@...hat.com>,
        "open list:MIPS" <linux-mips@...r.kernel.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>
Subject: Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM:
 Halt-polling fixes, cleanups and a new stat)

On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
> Hi Sean, Paolo,
> 
> I recently also notice the behavior change of param halt_poll_ns.
> Now it loses the ability to:
> 1) dynamically disable halt polling for all the running VMs
> by `echo 0 > /sys`
> 2) dynamically adjust the halt polling interval for all the
> running VMs by `echo * > /sys`
> 
> While in our cases, we usually use above two abilities, and
> KVM_CAP_HALT_POLL is not used yet.

I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
below.

That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns. If we truly need a mechanism for admins to
disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
separate module parameter for that. But IMO, any setup that is
sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
KVM_CAP_HALT_POLL to disable halt polling.

If everyone is happy with this approach I can test and send a real patch
to the mailing list.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 			goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}

> 
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > <borntraeger@...ibm.com> wrote:
> > > > > So I think there are two possibilities that makes sense:
> > > > > 
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> > > Yes, that's what I meant.  David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> I would be very nice that this issue can be solved.
> 
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >                  update_halt_poll_stats(vcpu, start, poll_end, !waited);
> > 
> >          if (halt_poll_allowed) {
> > +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +                       max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >                  if (!vcpu_valid_wakeup(vcpu)) {
> >                          shrink_halt_poll_ns(vcpu);
> > -               } else if (vcpu->kvm->max_halt_poll_ns) {
> > +               } else if (max_halt_poll_ns) {
> >                          if (halt_ns <= vcpu->halt_poll_ns)
> >                                  ;
> >                          /* we had a long block, shrink polling */
> >                          else if (vcpu->halt_poll_ns &&
> > -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +                                halt_ns > max_halt_poll_ns)
> >                                  shrink_halt_poll_ns(vcpu);
> >                          /* we had a short halt and our poll time is too small */
> > -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -                               grow_halt_poll_ns(vcpu);
> > +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +                                halt_ns < max_halt_poll_ns)
> > +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >                  } else {
> >                          vcpu->halt_poll_ns = 0;
> >                  }
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@...ts.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > .
> Thanks,
> Yanan


lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives