Message-ID: <aPJ4d3frVpRA7WKG@google.com>
Date: Fri, 17 Oct 2025 10:10:15 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Chao Gao <chao.gao@...el.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"Kirill A. Shutemov" <kas@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
linux-coco@...ts.linux.dev, kvm@...r.kernel.org,
Dan Williams <dan.j.williams@...el.com>, Xin Li <xin@...or.com>,
Kai Huang <kai.huang@...el.com>, Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [RFC PATCH 2/4] KVM: x86: Extract VMXON and EFER.SVME enablement to kernel

On Fri, Oct 17, 2025, Chao Gao wrote:
> > void vmx_emergency_disable_virtualization_cpu(void)
> > {
> > int cpu = raw_smp_processor_id();
> > struct loaded_vmcs *v;
> >
> >- kvm_rebooting = true;
> >-
> >- /*
> >- * Note, CR4.VMXE can be _cleared_ in NMI context, but it can only be
> >- * set in task context. If this races with VMX is disabled by an NMI,
> >- * VMCLEAR and VMXOFF may #UD, but KVM will eat those faults due to
> >- * kvm_rebooting set.
> >- */
> >- if (!(__read_cr4() & X86_CR4_VMXE))
> >- return;
> >+ WARN_ON_ONCE(!virt_rebooting);
> >+ virt_rebooting = true;
>
> This is unnecessary as virt_rebooting has been set to true ...
>
> >+static void x86_vmx_emergency_disable_virtualization_cpu(void)
> >+{
> >+ virt_rebooting = true;
>
> ... here.
>
> and ditto for SVM.

Yeah, I wasn't sure what to do. I agree it's redundant, but it's harmless,
whereas not having virt_rebooting set would be Very Bad (TM). I think you're
probably right, and we should just assume we aren't terrible at programming.
Setting the flag in KVM could even hide latent bugs, e.g. if code runs before
x86_virt_invoke_kvm_emergency_callback().

> >+ /*
> >+ * Note, CR4.VMXE can be _cleared_ in NMI context, but it can only be
> >+ * set in task context. If this races with VMX being disabled via NMI,
> >+ * VMCLEAR and VMXOFF may #UD, but the kernel will eat those faults due
> >+ * to virt_rebooting being set.
> >+ */
> >+ if (!(__read_cr4() & X86_CR4_VMXE))
> >+ return;
> >+
> >+ x86_virt_invoke_kvm_emergency_callback();
> >+
> >+ x86_vmx_cpu_vmxoff();
> >+}
> >+
>
> <snip>
>
> >+void x86_virt_put_cpu(int feat)
> >+{
> >+ if (WARN_ON_ONCE(!this_cpu_read(virtualization_nr_users)))
> >+ return;
> >+
> >+ if (this_cpu_dec_return(virtualization_nr_users) && !virt_rebooting)
> >+ return;
>
> any reason to check virt_rebooting here?
>
> It seems unnecessary because both the emergency reboot case and shutdown case
> work fine without it, and keeping it might prevent us from discovering real
> bugs, e.g., KVM or TDX failing to decrease the refcount.

*sigh*

I simply misread my own code (and I suspect I pivoted on what I was doing). I
just spent ~10 minutes typing up various responses about how the emergency code
needs to _force_ VMX/SVM off, but I kept overlooking the fact that the emergency
hooks bypass the refcounting (which is obviously very intentional). /facepalm

So yeah, I agree that exempting the refcount on virt_rebooting is bad here.
E.g. if kvm_shutdown() runs before tdx_shutdown(), then KVM will pull the rug
out from under TDX, and hw/virt.c will attempt to disable virtualization twice.
Which is "fine" thanks to the hardening, but gross and unnecessary.

Thanks so much!

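For illustration only, here is a rough user-space model of the double-disable
scenario described above. It is a sketch, not kernel code: struct virt_model,
model_put() and model_shutdown() are hypothetical names, the per-CPU counter is
reduced to a plain int, and it assumes the rebooting flag is already set by the
time both users drop their references.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names come from the patch. */
struct virt_model {
	int nr_users;    /* stands in for the per-CPU virtualization_nr_users */
	bool rebooting;  /* stands in for virt_rebooting */
	int nr_disables; /* number of times "hardware" disable was attempted */
};

/*
 * Drop one reference. With exempt_on_reboot set, this mirrors the check
 * under discussion: a non-zero count is ignored while rebooting. Returns
 * true if this call attempted to disable virtualization.
 */
static bool model_put(struct virt_model *m, bool exempt_on_reboot)
{
	if (m->nr_users == 0)
		return false; /* the real code would WARN here */

	if (--m->nr_users && !(exempt_on_reboot && m->rebooting))
		return false; /* other users remain, leave it enabled */

	assert(m->nr_users >= 0);
	m->nr_disables++;
	return true;
}

/* Two users (think KVM and TDX) each drop their reference at shutdown. */
static int model_shutdown(bool exempt_on_reboot)
{
	struct virt_model m = { .nr_users = 2, .rebooting = true };

	model_put(&m, exempt_on_reboot); /* first user, e.g. KVM */
	model_put(&m, exempt_on_reboot); /* second user, e.g. TDX */
	return m.nr_disables;
}
```

In this model, model_shutdown(true) attempts the disable twice (the first user
pulls the rug out from under the second), while model_shutdown(false) attempts
it exactly once, when the last reference is dropped.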
> >+
> >+ if (x86_virt_is_vmx() && feat == X86_FEATURE_VMX)
> >+ x86_vmx_put_cpu();
> >+ else if (x86_virt_is_svm() && feat == X86_FEATURE_SVM)
> >+ x86_svm_put_cpu();
> >+ else
> >+ WARN_ON_ONCE(1);
> >+}
> >+EXPORT_SYMBOL_GPL(x86_virt_put_cpu);