Message-ID: <ZiBc13qU6P3OBn7w@google.com>
Date: Wed, 17 Apr 2024 16:35:51 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Kai Huang <kai.huang@...el.com>
Cc: Tina Zhang <tina.zhang@...el.com>, Hang Yuan <hang.yuan@...el.com>,
Bo2 Chen <chen.bo@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Erdem Aktas <erdemaktas@...gle.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
Isaku Yamahata <isaku.yamahata@...el.com>,
"isaku.yamahata@...ux.intel.com" <isaku.yamahata@...ux.intel.com>
Subject: Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when
loading the KVM intel kernel module
On Thu, Apr 18, 2024, Kai Huang wrote:
> On 18/04/2024 2:40 am, Sean Christopherson wrote:
> > This way, architectures that aren't saddled with out-of-tree hypervisors can do
> > the dead simple thing of enabling hardware during their initialization sequence,
> > and the TDX code is much more sane, e.g. invoke kvm_x86_enable_virtualization()
> > during late_hardware_setup(), and kvm_x86_disable_virtualization() during module
> > exit (presumably).
>
> Fine to me, given I am not familiar with other ARCHs, assuming always enable
> virtualization when KVM present is fine to them. :-)
>
> Two questions below:
>
> > +int kvm_x86_enable_virtualization(void)
> > +{
> > +	int r;
> > +
> > +	guard(mutex)(&vendor_module_lock);
>
> It's a little bit odd to take the vendor_module_lock mutex.
>
> It is called by kvm_arch_init_vm(), so more reasonably we should still use
> kvm_lock?

I think this should take an x86-specific lock, since it's guarding x86-specific
data.  And vendor_module_lock fits the bill perfectly.  Well, except for the
name, and I definitely have no objection to renaming it.

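To illustrate the shape being discussed, here is a rough userspace sketch of a usage count guarded by its own dedicated lock. Everything named demo_* is a hypothetical stand-in, a pthread mutex substitutes for the kernel mutex, and demo_hw_enable() is a placeholder for whatever actually enables hardware; none of it is the patch itself.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_usage_count;		/* data guarded by demo_lock */
static int demo_hw_enable_calls;	/* how often "hardware" was enabled */
static int demo_hw_enable_result;	/* set non-zero to simulate failure */

/* Placeholder for the real enabling work. */
static int demo_hw_enable(void)
{
	demo_hw_enable_calls++;
	return demo_hw_enable_result;
}

/* Only the first user enables hardware; later users just bump the count. */
int demo_enable_virtualization(void)
{
	int r = 0;

	pthread_mutex_lock(&demo_lock);
	if (demo_usage_count++ == 0) {
		r = demo_hw_enable();
		if (r)
			--demo_usage_count;	/* roll back on failure */
	}
	pthread_mutex_unlock(&demo_lock);

	return r;
}
```

The count and the lock live side by side, which is the argument for an arch-specific lock: whoever touches the count takes the lock that sits next to it.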
> Also, if we invoke kvm_x86_enable_virtualization() from
> kvm_x86_ops->late_hardware_setup(), then IIUC we will deadlock here because
> kvm_x86_vendor_init() already takes the vendor_module_lock?

Ah, yeah.  Oh, duh.  I think the reason I didn't initially suggest late_hardware_setup()
is that I was assuming/hoping TDX setup could be done after kvm_x86_vendor_init().
E.g. in vt_init() or whatever it gets called:

	r = kvm_x86_vendor_init(...);
	if (r)
		return r;

	if (enable_tdx) {
		r = tdx_blah_blah_blah();
		if (r)
			goto vendor_exit;
	}

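The ordering problem can be sketched with a toy, single-threaded model. The assumptions are loud here: demo_lock() is a made-up lock that reports -EDEADLK on a relock instead of hanging the way a real non-recursive mutex would, and every demo_* function is a hypothetical stand-in for the vendor init, vendor exit, and enable paths under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: a relock returns -EDEADLK rather than hanging. */
static bool demo_lock_held;

static int demo_lock(void)
{
	if (demo_lock_held)
		return -EDEADLK;
	demo_lock_held = true;
	return 0;
}

static void demo_unlock(void)
{
	demo_lock_held = false;
}

/* Stand-in for the enable path, which takes the lock itself. */
static int demo_enable_virtualization(void)
{
	int r = demo_lock();

	if (r)
		return r;
	demo_unlock();
	return 0;
}

/* Variant 1: enable is called while vendor init still holds the lock. */
int demo_vendor_init_nested(void)
{
	int r;

	demo_lock();
	r = demo_enable_virtualization();	/* relock attempt fails */
	demo_unlock();
	return r;
}

/* Variant 2: vendor init finishes and drops the lock before TDX setup. */
static int demo_vendor_init(void)
{
	int r = demo_lock();

	if (r)
		return r;
	demo_unlock();
	return 0;
}

static void demo_vendor_exit(void)
{
}

int demo_vt_init(bool enable_tdx)
{
	int r = demo_vendor_init();

	if (r)
		return r;

	if (enable_tdx) {
		r = demo_enable_virtualization();
		if (r)
			goto vendor_exit;
	}
	return 0;

vendor_exit:
	demo_vendor_exit();
	return r;
}
```

In this model demo_vendor_init_nested() fails with -EDEADLK, while demo_vt_init(true) succeeds because the lock is free again by the time the enable path asks for it.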
> > +	if (kvm_usage_count++)
> > +		return 0;
> > +
> > +	r = kvm_enable_virtualization();
> > +	if (r)
> > +		--kvm_usage_count;
> > +
> > +	return r;
> > +}
> > +EXPORT_SYMBOL_GPL(kvm_x86_enable_virtualization);
> > +
>
> [...]
>
> > +int kvm_enable_virtualization(void)
> > {
> > +	int r;
> > +
> > +	r = cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
> > +			      kvm_online_cpu, kvm_offline_cpu);
> > +	if (r)
> > +		return r;
> > +
> > +	register_syscore_ops(&kvm_syscore_ops);
> > +
> > +	/*
> > +	 * Manually undo virtualization enabling if the system is going down.
> > +	 * If userspace initiated a forced reboot, e.g. reboot -f, then it's
> > +	 * possible for an in-flight module load to enable virtualization
> > +	 * after syscore_shutdown() is called, i.e. without kvm_shutdown()
> > +	 * being invoked.  Note, this relies on system_state being set _before_
> > +	 * kvm_shutdown(), e.g. to ensure either kvm_shutdown() is invoked
> > +	 * or this CPU observes the impending shutdown.  Which is why KVM uses
> > +	 * a syscore ops hook instead of registering a dedicated reboot
> > +	 * notifier (the latter runs before system_state is updated).
> > +	 */
> > +	if (system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF ||
> > +	    system_state == SYSTEM_RESTART) {
> > +		unregister_syscore_ops(&kvm_syscore_ops);
> > +		cpuhp_remove_state(CPUHP_AP_KVM_ONLINE);
> > +		return -EBUSY;
> > +	}
> > +
> > +
>
> Aren't we also supposed to do:
>
> on_each_cpu(__kvm_enable_virtualization, NULL, 1);
>
> here?

No, cpuhp_setup_state() invokes the callback, kvm_online_cpu(), on each CPU.
I.e. KVM has been doing things the hard way by using cpuhp_setup_state_nocalls().
That's part of the complexity I would like to get rid of.
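
A rough userspace model of that difference, in case it helps. The demo_* helpers are hypothetical mocks, not the cpuhp API: they assume a fixed set of four online CPUs and leave out teardown callbacks, error rollback, and real hotplug events.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4	/* assumed: every CPU is already online */

typedef int (*demo_cpu_cb)(unsigned int cpu);

static int demo_enabled[DEMO_NR_CPUS];
static demo_cpu_cb demo_online_cb;	/* callback kept for later hotplug */

/* Stand-in for the per-CPU online callback. */
static int demo_online_cpu(unsigned int cpu)
{
	demo_enabled[cpu] = 1;
	return 0;
}

/* Models the _nocalls flavor: register the callback and nothing more. */
int demo_setup_state_nocalls(demo_cpu_cb online)
{
	demo_online_cb = online;
	return 0;
}

/* Models the plain flavor: register, then run the callback on each CPU. */
int demo_setup_state(demo_cpu_cb online)
{
	unsigned int cpu;
	int r;

	demo_online_cb = online;
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		r = online(cpu);
		if (r)
			return r;
	}
	return 0;
}

int demo_count_enabled(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		n += demo_enabled[cpu];
	return n;
}
```

With the _nocalls model, no CPU is enabled after registration, so the caller would need its own on_each_cpu()-style pass; with the plain model, all four are enabled as a side effect of registering.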