Message-ID: <BN9PR11MB54338553E818F452D87D27D88C689@BN9PR11MB5433.namprd11.prod.outlook.com>
Date: Wed, 1 Dec 2021 07:18:21 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"Huang, Kai" <kai.huang@...el.com>,
"Nakajima, Jun" <jun.nakajima@...el.com>,
"Hansen, Dave" <dave.hansen@...el.com>,
"Gao, Chao" <chao.gao@...el.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: RE: Q. about KVM and CPU hotplug
> From: Paolo Bonzini <paolo.bonzini@...il.com>
> Sent: Wednesday, December 1, 2021 12:27 AM
>
> On 11/30/21 15:05, Thomas Gleixner wrote:
> > Why is this hotplug callback in the CPU starting section to begin with?
>
> Just because the old notifier implementation used CPU_STARTING - in fact
> the commit messages say that CPU_STARTING was added partly *for* KVM
> (commit e545a6140b69, "kernel/cpu.c: create a CPU_STARTING cpu_chain
> notifier", 2008-09-08).
>
> > If you stick it into the online section which runs on the hotplugged CPU
> > in thread context:
> >
> > CPUHP_AP_ONLINE_IDLE,
> >
> > --> CPUHP_AP_KVM_STARTING,
> >
> > CPUHP_AP_SCHED_WAIT_EMPTY,
> >
> > then it is allowed to fail and it still works in the right way.
>
> Yes, moving it to the online section should be fine; it wouldn't solve
> the TDX problem however. Failure would rollback the hotplug and forbid
> hotplug altogether when TDX is loaded, which is not acceptable.
>
Failing hotplug just because TDX is loaded is not acceptable.
But failing hotplug when a trusted domain using TDX is active makes
sense, IMO. It's a similar philosophy to VMX, which, with the above
change, will fail hotplug when kvm_usage_count is non-zero (implying
a VM is active) but VMX initialization fails on this CPU. We can add a
similar tdx_usage_count to mark active TDX users and forbid hotplug
when this variable is non-zero.

In general I think it's an acceptable policy to fail an operation if it
would break existing active usages... 😊
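
To make the proposed policy concrete, here is a rough userspace sketch, not
kernel code. Only kvm_usage_count and tdx_usage_count come from the discussion
above; the helper names (tdx_get, tdx_put, kvm_online_cpu, tdx_online_cpu),
the vmx_init_ok parameter, and the chosen error codes are hypothetical and
only illustrate the idea. A real implementation would also need to serialize
the counter update against CPU hotplug.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of active VMs (named in the thread) and of active TDX users
 * (the proposed counter). Plain atomics stand in for kernel locking. */
static atomic_int kvm_usage_count;
static atomic_int tdx_usage_count;

/* Hypothetical helpers: a trusted domain takes a reference while active. */
void tdx_get(void)
{
    atomic_fetch_add(&tdx_usage_count, 1);
}

void tdx_put(void)
{
    int old = atomic_fetch_sub(&tdx_usage_count, 1);
    assert(old > 0); /* unbalanced put would be a bug */
}

/* VMX-style policy: fail the online step only if a VM is active AND
 * VMX initialization failed on the incoming CPU (passed in here as a
 * flag, since this sketch cannot touch real hardware). */
int kvm_online_cpu(unsigned int cpu, bool vmx_init_ok)
{
    (void)cpu;
    if (atomic_load(&kvm_usage_count) > 0 && !vmx_init_ok)
        return -EIO;
    return 0;
}

/* Proposed TDX policy: forbid hotplug while any TDX user is active.
 * A non-zero return from an online-section callback rolls the
 * hotplug operation back. */
int tdx_online_cpu(unsigned int cpu)
{
    (void)cpu;
    if (atomic_load(&tdx_usage_count) > 0)
        return -EBUSY;
    return 0;
}
```

With no trusted domain active, tdx_online_cpu() lets the CPU come up even
though TDX is loaded; once tdx_get() has been called it returns -EBUSY until
the matching tdx_put().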
Thanks
Kevin