lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240718172631.GI1900928@ls.amr.corp.intel.com>
Date: Thu, 18 Jul 2024 10:26:31 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: Binbin Wu <binbin.wu@...ux.intel.com>
Cc: Isaku Yamahata <isaku.yamahata@...el.com>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, isaku.yamahata@...il.com,
	Paolo Bonzini <pbonzini@...hat.com>, erdemaktas@...gle.com,
	Sean Christopherson <seanjc@...gle.com>,
	Sagi Shahar <sagis@...gle.com>, Kai Huang <kai.huang@...el.com>,
	chen.bo@...el.com, hang.yuan@...el.com, tina.zhang@...el.com,
	rick.p.edgecombe@...el.com
Subject: Re: [PATCH v19 117/130] KVM: TDX: Silently ignore INIT/SIPI

On Thu, Jul 18, 2024 at 09:42:11AM +0800,
Binbin Wu <binbin.wu@...ux.intel.com> wrote:

> 
> 
> On 7/17/2024 6:55 AM, Isaku Yamahata wrote:
> > On Mon, Jun 17, 2024 at 09:20:36AM +0800,
> > Binbin Wu <binbin.wu@...ux.intel.com> wrote:
> > 
> > > 
> > > On 2/26/2024 4:26 PM, isaku.yamahata@...el.com wrote:
> > > > From: Isaku Yamahata <isaku.yamahata@...el.com>
> > > > 
> > > > The TDX module API doesn't provide API for VMM to inject INIT IPI and SIPI.
> > > > Instead it defines the different protocols to boot application processors.
> > > > Ignore INIT and SIPI events for the TDX guest.
> > > > 
> > > > There are two options. 1) (silently) ignore INIT/SIPI request or 2) return
> > > > error to guest TDs somehow.  Given that TDX guest is paravirtualized to
> > > > boot AP, the option 1 is chosen for simplicity.
> > > > 
> > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> > > > ---
> > > >    arch/x86/include/asm/kvm-x86-ops.h |  1 +
> > > >    arch/x86/include/asm/kvm_host.h    |  2 ++
> > > >    arch/x86/kvm/lapic.c               | 19 +++++++++++-------
> > > >    arch/x86/kvm/svm/svm.c             |  1 +
> > > >    arch/x86/kvm/vmx/main.c            | 32 ++++++++++++++++++++++++++++--
> > > >    arch/x86/kvm/vmx/tdx.c             |  4 ++--
> > > >    6 files changed, 48 insertions(+), 11 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> > > > index 22d93d4124c8..85c04aad6ab3 100644
> > > > --- a/arch/x86/include/asm/kvm-x86-ops.h
> > > > +++ b/arch/x86/include/asm/kvm-x86-ops.h
> > > > @@ -149,6 +149,7 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
> > > >    KVM_X86_OP(msr_filter_changed)
> > > >    KVM_X86_OP(complete_emulated_msr)
> > > >    KVM_X86_OP(vcpu_deliver_sipi_vector)
> > > > +KVM_X86_OP(vcpu_deliver_init)
> > > >    KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> > > >    KVM_X86_OP_OPTIONAL(get_untagged_addr)
> > > >    KVM_X86_OP_OPTIONAL_RET0(gmem_max_level)
> > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > index bb8be091f996..2686c080820b 100644
> > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > @@ -1836,6 +1836,7 @@ struct kvm_x86_ops {
> > > >    	int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
> > > >    	void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
> > > > +	void (*vcpu_deliver_init)(struct kvm_vcpu *vcpu);
> > > >    	/*
> > > >    	 * Returns vCPU specific APICv inhibit reasons
> > > > @@ -2092,6 +2093,7 @@ void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
> > > >    void kvm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
> > > >    int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg);
> > > >    void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
> > > > +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu);
> > > >    int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
> > > >    		    int reason, bool has_error_code, u32 error_code);
> > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > index 8025c7f614e0..431074679e83 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -3268,6 +3268,16 @@ int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
> > > >    	return 0;
> > > >    }
> > > > +void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +	kvm_vcpu_reset(vcpu, true);
> > > > +	if (kvm_vcpu_is_bsp(vcpu))
> > > > +		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> > > > +	else
> > > > +		vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_init);
> > > > +
> > > >    int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
> > > >    {
> > > >    	struct kvm_lapic *apic = vcpu->arch.apic;
> > > > @@ -3299,13 +3309,8 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
> > > >    		return 0;
> > > >    	}
> > > > -	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
> > > > -		kvm_vcpu_reset(vcpu, true);
> > > > -		if (kvm_vcpu_is_bsp(apic->vcpu))
> > > > -			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> > > > -		else
> > > > -			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
> > > > -	}
> > > > +	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events))
> > > > +		static_call(kvm_x86_vcpu_deliver_init)(vcpu);
> > > >    	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events)) {
> > > >    		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> > > >    			/* evaluate pending_events before reading the vector */
> > > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > > index f76dd52d29ba..27546d993809 100644
> > > > --- a/arch/x86/kvm/svm/svm.c
> > > > +++ b/arch/x86/kvm/svm/svm.c
> > > > @@ -5037,6 +5037,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> > > >    	.complete_emulated_msr = svm_complete_emulated_msr,
> > > >    	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
> > > > +	.vcpu_deliver_init = kvm_vcpu_deliver_init,
> > > >    	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
> > > >    };
> > > > diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> > > > index 4f3b872cd401..84d2dc818cf7 100644
> > > > --- a/arch/x86/kvm/vmx/main.c
> > > > +++ b/arch/x86/kvm/vmx/main.c
> > > > @@ -320,6 +320,14 @@ static void vt_enable_smi_window(struct kvm_vcpu *vcpu)
> > > >    }
> > > >    #endif
> > > > +static bool vt_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +	if (is_td_vcpu(vcpu))
> > > > +		return true;
> > > Since for TD, INIT is always blocked, then in kvm_apic_accept_events(), the
> > > code path to handle INIT/SIPI delivery will not be called, i.e, the OPs
> > > .vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for
> > > TD.
> > > Seems no need to add the new interface  vcpu_deliver_init or the new wrapper
> > > vt_vcpu_deliver_sipi_vector().
> > > 
> > > And consider the INIT/SIPI for TD:
> > > - Normally, for TD, INIT ans SIPI should not be set in APIC's
> > > pending_events.
> > >    Maybe we can call KVM_BUG_ON() in vt_apic_init_signal_blocked() for TD?
> > > - If INIT and SIPI are allowed be set in APIC's pending_events for somehow,
> > > the current code has a problem, it will never clear INIT bit in APIC's
> > > pending_events.
> > >    Then kvm_apic_accept_events() needs to execute more check code if INIT was
> > > once set.
> > >    INIT bit should be cleared with this assumption.
> > 
> > KVM_SET_MP_STATE and KVM_SET_VCPU_EVENTS can set INIT/SIPI by the user space.
> > If we change those two IOCTLs to reject INIT/SIPI for TDX, we can go for the
> > above "Normally" option.  Because I didn't want to touch the common functions, I
> > ended in extra callbacks.  The user space shouldn't use those two IOCTLs for
> > TDX, though.
> 
> Even though INIT/SIPI can be set by userspace to APIC's pending_events,
> for TD, INIT is always blocked, then in kvm_apic_accept_events(), the
> code path to handle INIT/SIPI delivery will not be called, i.e, the OPs
> .vcpu_deliver_init() and .vcpu_deliver_sipi_vector() are never called for
> TD.
> 
> Can we just drop the new OPs .vcpu_deliver_init() and the new wrapper
> vt_vcpu_deliver_sipi_vector()?

Ah, you're right. We simply always block INIT/SIPI.  We should update the commit
message.  How about this?  Added option 3.

The TDX module API doesn't provide API for VMM to inject INIT IPI and SIPI.
Instead it defines the different protocols to boot application processors.
Ignore INIT and SIPI events for the TDX guest.

There are two options. 1) (silently) ignore INIT/SIPI request or 2) return
error to guest TDs somehow.  3) Always block INIT/SIPI always.
Given that TDX guest is paravirtualized to boot AP, the option 3
is chosen for simplicity.
-- 
Isaku Yamahata <isaku.yamahata@...el.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ