[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d55d3c5a-41d9-46bd-91da-f07187501bfa@linux.intel.com>
Date: Thu, 18 Jul 2024 09:42:11 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Isaku Yamahata <isaku.yamahata@...el.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
isaku.yamahata@...il.com, Paolo Bonzini <pbonzini@...hat.com>,
erdemaktas@...gle.com, Sean Christopherson <seanjc@...gle.com>,
Sagi Shahar <sagis@...gle.com>, Kai Huang <kai.huang@...el.com>,
chen.bo@...el.com, hang.yuan@...el.com, tina.zhang@...el.com,
rick.p.edgecombe@...el.com
Subject: Re: [PATCH v19 117/130] KVM: TDX: Silently ignore INIT/SIPI
On 7/17/2024 6:55 AM, Isaku Yamahata wrote:
> On Mon, Jun 17, 2024 at 09:20:36AM +0800,
> Binbin Wu <binbin.wu@...ux.intel.com> wrote:
>
>>
>> On 2/26/2024 4:26 PM, isaku.yamahata@...el.com wrote:
>>> From: Isaku Yamahata <isaku.yamahata@...el.com>
>>>
>>> The TDX module API doesn't provide API for VMM to inject INIT IPI and SIPI.
>>> Instead it defines the different protocols to boot application processors.
>>> Ignore INIT and SIPI events for the TDX guest.
>>>
>>> There are two options. 1) (silently) ignore INIT/SIPI request or 2) return
>>> error to guest TDs somehow. Given that TDX guest is paravirtualized to
>>> boot AP, the option 1 is chosen for simplicity.
>>>
>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
>>> ---
>>> arch/x86/include/asm/kvm-x86-ops.h | 1 +
>>> arch/x86/include/asm/kvm_host.h | 2 ++
>>> arch/x86/kvm/lapic.c | 19 +++++++++++-------
>>> arch/x86/kvm/svm/svm.c | 1 +
>>> arch/x86/kvm/vmx/main.c | 32 ++++++++++++++++++++++++++++--
>>> arch/x86/kvm/vmx/tdx.c | 4 ++--
>>> 6 files changed, 48 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
>>> index 22d93d4124c8..85c04aad6ab3 100644
>>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>>> @@ -149,6 +149,7 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>>> KVM_X86_OP(msr_filter_changed)
>>> KVM_X86_OP(complete_emulated_msr)
>>> KVM_X86_OP(vcpu_deliver_sipi_vector)
>>> +KVM_X86_OP(vcpu_deliver_init)
>>> KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>>> KVM_X86_OP_OPTIONAL(get_untagged_addr)
>>> KVM_X86_OP_OPTIONAL_RET0(gmem_max_level)
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index bb8be091f996..2686c080820b 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -1836,6 +1836,7 @@ struct kvm_x86_ops {
>>> int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
>>> void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
>>> + void (*vcpu_deliver_init)(struct kvm_vcpu *vcpu);
>>> /*
>>> * Returns vCPU specific APICv inhibit reasons
>>> @@ -2092,6 +2093,7 @@ void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
>>> void kvm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
>>> int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg);
>>> void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>>> +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu);
>>> int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
>>> int reason, bool has_error_code, u32 error_code);
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 8025c7f614e0..431074679e83 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -3268,6 +3268,16 @@ int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
>>> return 0;
>>> }
>>> +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu)
>>> +{
>>> + kvm_vcpu_reset(vcpu, true);
>>> + if (kvm_vcpu_is_bsp(vcpu))
>>> + vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>>> + else
>>> + vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
>>> +}
>>> +EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_init);
>>> +
>>> int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
>>> {
>>> struct kvm_lapic *apic = vcpu->arch.apic;
>>> @@ -3299,13 +3309,8 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
>>> return 0;
>>> }
>>> - if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
>>> - kvm_vcpu_reset(vcpu, true);
>>> - if (kvm_vcpu_is_bsp(apic->vcpu))
>>> - vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>>> - else
>>> - vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
>>> - }
>>> + if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events))
>>> + static_call(kvm_x86_vcpu_deliver_init)(vcpu);
>>> if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events)) {
>>> if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
>>> /* evaluate pending_events before reading the vector */
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index f76dd52d29ba..27546d993809 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -5037,6 +5037,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>>> .complete_emulated_msr = svm_complete_emulated_msr,
>>> .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
>>> + .vcpu_deliver_init = kvm_vcpu_deliver_init,
>>> .vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
>>> };
>>> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
>>> index 4f3b872cd401..84d2dc818cf7 100644
>>> --- a/arch/x86/kvm/vmx/main.c
>>> +++ b/arch/x86/kvm/vmx/main.c
>>> @@ -320,6 +320,14 @@ static void vt_enable_smi_window(struct kvm_vcpu *vcpu)
>>> }
>>> #endif
>>> +static bool vt_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
>>> +{
>>> + if (is_td_vcpu(vcpu))
>>> + return true;
>> Since for TD, INIT is always blocked, then in kvm_apic_accept_events(), the
>> code path to handle INIT/SIPI delivery will not be called, i.e, the OPs
>> .vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for
>> TD.
>> Seems no need to add the new interface vcpu_deliver_init or the new wrapper
>> vt_vcpu_deliver_sipi_vector().
>>
>> And consider the INIT/SIPI for TD:
>> - Normally, for TD, INIT ans SIPI should not be set in APIC's
>> pending_events.
>> Maybe we can call KVM_BUG_ON() in vt_apic_init_signal_blocked() for TD?
>> - If INIT and SIPI are allowed be set in APIC's pending_events for somehow,
>> the current code has a problem, it will never clear INIT bit in APIC's
>> pending_events.
>> Then kvm_apic_accept_events() needs to execute more check code if INIT was
>> once set.
>> INIT bit should be cleared with this assumption.
>
> KVM_SET_MP_STATE and KVM_SET_VCPU_EVENTS can set INIT/SIPI by the user space.
> If we change those two IOCTLs to reject INIT/SIPI for TDX, we can go for the
> above "Normally" option. Because I didn't want to touch the common functions, I
> ended in extra callbacks. The user space shouldn't use those two IOCTLs for
> TDX, though.
Even though INIT/SIPI can be set by userspace to APIC's pending_events,
for TD, INIT is always blocked, then in kvm_apic_accept_events(), the
code path to handle INIT/SIPI delivery will not be called, i.e, the OPs
.vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for
TD.
Can we just drop the new OPs .vcpu_deliver_init() and the new wrapper
vt_vcpu_deliver_sipi_vector()?
Powered by blists - more mailing lists