[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2383a1e9-ba2b-470f-8807-5f5f2528c7ad@intel.com>
Date: Thu, 18 Apr 2024 11:09:41 +1200
From: "Huang, Kai" <kai.huang@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: "Zhang, Tina" <tina.zhang@...el.com>, "Yuan, Hang" <hang.yuan@...el.com>,
"Chen, Bo2" <chen.bo@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Aktas, Erdem"
<erdemaktas@...gle.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "Yamahata, Isaku"
<isaku.yamahata@...el.com>, "isaku.yamahata@...ux.intel.com"
<isaku.yamahata@...ux.intel.com>
Subject: Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when
loading the KVM intel kernel module
On 18/04/2024 2:40 am, Sean Christopherson wrote:
> On Wed, Apr 17, 2024, Kai Huang wrote:
>> On Tue, 2024-04-16 at 13:58 -0700, Sean Christopherson wrote:
>>> On Fri, Apr 12, 2024, Kai Huang wrote:
>>>> On 12/04/2024 2:03 am, Sean Christopherson wrote:
>>>>> On Thu, Apr 11, 2024, Kai Huang wrote:
>>>>>> I can certainly follow up with this and generate a reviewable patchset if I
>>>>>> can confirm with you that this is what you want?
>>>>>
>>>>> Yes, I think it's the right direction. I still have minor concerns about VMX
>>>>> being enabled while kvm.ko is loaded, which means that VMXON will _always_ be
>>>>> enabled if KVM is built-in. But after seeing the complexity that is needed to
>>>>> safely initialize TDX, and after seeing just how much complexity KVM already
>>>>> has because it enables VMX on-demand (I hadn't actually tried removing that code
>>>>> before), I think the cost of that complexity far outweighs the risk of "always"
>>>>> being post-VMXON.
>>>>
>>>> Does always leaving VMXON have any actual damage, given we have emergency
>>>> virtualization shutdown?
>>>
>>> Being post-VMXON increases the risk of kexec() into the kdump kernel failing.
>>> The tradeoffs that we're trying to balance are: is the risk of kexec() failing
>>> due to the complexity of the emergency VMX code higher than the risk of us breaking
>>> things in general due to taking on a ton of complexity to juggle VMXON for TDX?
>>>
>>> After seeing the latest round of TDX code, my opinion is that being post-VMXON
>>> is less risky overall, in no small part because we need that to work anyways for
>>> hosts that are actively running VMs.
>>
>> How about we only keep VMX always on when TDX is enabled?
>
> Paolo also suggested that forcing VMXON only if TDX is enabled, mostly because
> kvm-intel.ko and kvm-amd.ko may be auto-loaded based on MODULE_DEVICE_TABLE(),
> which in turn causes problems for out-of-tree hypervisors that want control over
> VMX and SVM.
>
> I'm not opposed to the idea, it's the complexity and messiness I dislike. E.g.
> the TDX code shouldn't have to deal with CPU hotplug locks, core KVM shouldn't
> need to expose nolock helpers, etc. And if we're going to make non-trivial
> changes to the core KVM hardware enabling code anyways...
>
> What about this? Same basic idea as before, but instead of unconditionally doing
> hardware enabling during module initialization, let TDX do hardware enabling in
> a late_hardware_setup(), and then have KVM x86 ensure virtualization is enabled
> when creating VMs.
>
> This way, architectures that aren't saddled with out-of-tree hypervisors can do
> the dead simple thing of enabling hardware during their initialization sequence,
> and the TDX code is much more sane, e.g. invoke kvm_x86_enable_virtualization()
> during late_hardware_setup(), and kvm_x86_disable_virtualization() during module
> exit (presumably).
Fine to me, given I am not familiar with other ARCHs, assuming always
enable virtualization when KVM present is fine to them. :-)
Two questions below:
> +int kvm_x86_enable_virtualization(void)
> +{
> + int r;
> +
> + guard(mutex)(&vendor_module_lock);
It's a little bit odd to take the vendor_module_lock mutex.
It is called by kvm_arch_init_vm(), so more reasonablly we should still
use kvm_lock?
Also, if we invoke kvm_x86_enable_virtualization() from
kvm_x86_ops->late_hardware_setup(), then IIUC we will deadlock here
because kvm_x86_vendor_init() already takes the vendor_module_lock?
> +
> + if (kvm_usage_count++)
> + return 0;
> +
> + r = kvm_enable_virtualization();
> + if (r)
> + --kvm_usage_count;
> +
> + return r;
> +}
> +EXPORT_SYMBOL_GPL(kvm_x86_enable_virtualization);
> +
[...]
> +int kvm_enable_virtualization(void)
> {
> + int r;
> +
> + r = cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
> + kvm_online_cpu, kvm_offline_cpu);
> + if (r)
> + return r;
> +
> + register_syscore_ops(&kvm_syscore_ops);
> +
> + /*
> + * Manually undo virtualization enabling if the system is going down.
> + * If userspace initiated a forced reboot, e.g. reboot -f, then it's
> + * possible for an in-flight module load to enable virtualization
> + * after syscore_shutdown() is called, i.e. without kvm_shutdown()
> + * being invoked. Note, this relies on system_state being set _before_
> + * kvm_shutdown(), e.g. to ensure either kvm_shutdown() is invoked
> + * or this CPU observes the impedning shutdown. Which is why KVM uses
> + * a syscore ops hook instead of registering a dedicated reboot
> + * notifier (the latter runs before system_state is updated).
> + */
> + if (system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF ||
> + system_state == SYSTEM_RESTART) {
> + unregister_syscore_ops(&kvm_syscore_ops);
> + cpuhp_remove_state(CPUHP_AP_KVM_ONLINE);
> + return -EBUSY;
> + }
> +
Aren't we also supposed to do:
on_each_cpu(__kvm_enable_virtualization, NULL, 1);
here?
> return 0;
> }
>
Powered by blists - more mailing lists