[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aUVx20ZRjOzKgKqy@google.com>
Date: Fri, 19 Dec 2025 07:40:11 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>
Cc: Chao Gao <chao.gao@...el.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
Kiryl Shutsemau <kas@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
linux-coco@...ts.linux.dev, kvm@...r.kernel.org,
Dan Williams <dan.j.williams@...el.com>
Subject: Re: [PATCH v2 2/7] KVM: x86: Extract VMXON and EFER.SVME enablement
to kernel
On Fri, Dec 19, 2025, Xu Yilun wrote:
> On Wed, Dec 17, 2025 at 11:01:59AM -0800, Sean Christopherson wrote:
> > On Wed, Dec 17, 2025, Xu Yilun wrote:
> > > Is it better we explicitly assert the preemption for x86_virt_get_cpu()
> > > rather than embed the check in __this_cpu_inc_return()? We are not just
> > > protecting the racing for the reference counter. We should ensure the
> > > "counter increase + x86_virt_call(get_cpu)" can't be preempted.
> >
> > I don't have a strong preference. Using __this_cpu_inc_return() without any
> > nearby preemption_{enable,disable}() calls makes it quite clears that preemption
> > is expected to be disabled by the caller. But I'm also ok being explicit.
>
> Looking into __this_cpu_inc_return(), it finally calls
> check_preemption_disabled() which doesn't strictly requires preemption.
> It only ensures the context doesn't switch to another CPU. If the caller
> is in cpuhp context, preemption is possible.
Hmm, right, the cpuhp thread is is_percpu_thread(), and KVM's hooks aren't
considered atomic and so run with IRQs enabled. In practice, it's "fine", because
TDX also exclusively does x86_virt_get_cpu() from cpuhp, i.e. the two users are
mutually exclusive, but relying on that behavior is gross.
> But in this x86_virt_get_cpu(), we need to ensure preemption disabled,
> otherwise caller A increases counter but hasn't do actual VMXON yet and
> get preempted. Caller B opts in and get the wrong info that VMX is
> already on, and fails on following vmx operations.
>
> On a second thought, maybe we disable preemption inside
> x86_virt_get_cpu() to protect the counter-vmxon racing, this is pure
> internal thing for this kAPI.
Ya, that'd be my preference.
Kai, question for you (or anyone else that might know):
Is there any **need** for tdx_cpu_enable() and try_init_module_global() to run
with IRQs disabled? AFAICT, the lockdep_assert_irqs_disabled() checks added by
commit 6162b310bc21 ("x86/virt/tdx: Add skeleton to enable TDX on demand") were
purely because, _when the code was written_, KVM enabled virtualization via IPI
function calls.
But by the time the KVM code landed upstream in commit fcdbdf63431c ("KVM: VMX:
Initialize TDX during KVM module load"), that was no longer true, thanks to
commit 9a798b1337af ("KVM: Register cpuhp and syscore callbacks when enabling
hardware") setting the stage for doing everything from task context.
However, rather than update the kernel side, e.g. to drop the lockdep assertions
and related comments, commit 9a798b1337af instead did this:
local_irq_save(flags);
r = tdx_cpu_enable();
local_irq_restore(flags);
Somewhat frustratingly, I poked at this when the reworked code was first posted
(https://lore.kernel.org/all/ZyJOiPQnBz31qLZ7@google.com), but it just got swept
under the rug :-(
: > > + /* tdx_cpu_enable() must be called with IRQ disabled */
: >
: > I don't find this comment helpfu. If it explained _why_ tdx_cpu_enable() requires
: > IRQs to be disabled, then I'd feel differently, but as is, IMO it doesn't add value.
:
: I'll remove the comment.
:
: >
: > > + local_irq_save(flags);
: > > + r = tdx_cpu_enable();
: > > + local_irq_restore(flags);
Unless TDX _needs_ IRQs to be disabled, I would strongly prefer to drop that code
in prep patches so that it doesn't become even harder to disentagle the history
to figure out why tdx_online_cpu() disables IRQs:
static int tdx_online_cpu(unsigned int cpu)
{
int ret;
guard(irqsave)(); <=============== why is this here!?!?!
ret = x86_virt_get_cpu(X86_FEATURE_VMX);
if (ret)
return ret;
ret = tdx_cpu_enable();
if (ret)
x86_virt_put_cpu(X86_FEATURE_VMX);
return ret;
}
Side topic, KVM's change in behavior also means this comment is stale (though I
think it's worth keeping the assertion, but with a comment saying it's hardening
and paranoia, not a strick requirement).
/*
* IRQs must be disabled as virtualization is enabled in hardware via
* function call IPIs, i.e. IRQs need to be disabled to guarantee
* virtualization stays disabled.
*/
lockdep_assert_irqs_disabled();
Powered by blists - more mailing lists