Message-ID: <Z2IDkWPz2rhDLD0P@google.com>
Date: Tue, 17 Dec 2024 15:04:49 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: weizijie <zijie.wei@...ux.alibaba.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H . Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	xuyun <xuyun_xy.xy@...ux.alibaba.com>
Subject: Re: [PATCH] KVM: x86: ioapic: Optimize EOI handling to reduce
 unnecessary VM exits

On Thu, Nov 21, 2024, weizijie wrote:
> Address performance issues caused by a vector being reused by a
> non-IOAPIC source.
> 
> Commit 0fc5a36dd6b3
> ("KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race")
> addressed the race between EOI handling and IOAPIC reconfiguration.
> However, it introduced a performance concern:
> 
> Reconfiguring IOAPIC entries while an interrupt request (IRQ) is
> already in service leaves that vector set in
> `ioapic_handled_vectors`, which can unintentionally trigger VM exits
> for other interrupts that later reuse the vector and normally would
> not require one. Unless the IOAPIC is reconfigured again at runtime,
> the stale bit persists and continues to hurt performance.
> 
> Proposed fix:
> Record which vectors were pending EOI when ioapic_handled_vectors was
> last scanned. Then, when such a vector is EOI'd, clear its bit in
> ioapic_handled_vectors and reload the EOI exit bitmap, so that later
> users of the vector no longer take unnecessary VM exits.
> 
> Signed-off-by: weizijie <zijie.wei@...ux.alibaba.com>
> Signed-off-by: xuyun <xuyun_xy.xy@...ux.alibaba.com>

Your SoB should be last, and assuming Xuyun is a co-author, they need to be
credited via Co-developed-by.  See Documentation/process/submitting-patches.rst
for details.
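
I.e., assuming that's the case, something like:

  Co-developed-by: xuyun <xuyun_xy.xy@...ux.alibaba.com>
  Signed-off-by: xuyun <xuyun_xy.xy@...ux.alibaba.com>
  Signed-off-by: weizijie <zijie.wei@...ux.alibaba.com>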

> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/ioapic.c           | 11 +++++++++--
>  arch/x86/kvm/vmx/vmx.c          | 10 ++++++++++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e159e44a6a1b..b008c933d2ab 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1041,6 +1041,7 @@ struct kvm_vcpu_arch {
>  #if IS_ENABLED(CONFIG_HYPERV)
>  	hpa_t hv_root_tdp;
>  #endif
> +	DECLARE_BITMAP(ioapic_pending_vectors, 256);
>  };
>  
>  struct kvm_lpage_info {
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 995eb5054360..6f5a88dc63da 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -284,6 +284,8 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, ulong *ioapic_handled_vectors)

The split IRQ chip mode should have the same enhancement.
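
E.g. the pending-EOI check in kvm_scan_ioapic_routes() would need the
same split, roughly (untested sketch, mirroring the ioapic.c change):

	if (irq.trig_mode) {
		if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
					irq.dest_id, irq.dest_mode))
			__set_bit(irq.vector, ioapic_handled_vectors);
		else if (kvm_apic_pending_eoi(vcpu, irq.vector)) {
			__set_bit(irq.vector, ioapic_handled_vectors);
			__set_bit(irq.vector,
				  vcpu->arch.ioapic_pending_vectors);
		}
	}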

>  	spin_lock(&ioapic->lock);
>  
> +	bitmap_zero(vcpu->arch.ioapic_pending_vectors, 256);

Rather than use a bitmap, what does performance look like if this is a single u8
that tracks the highest in-service IRQ at the time of the last scan?  And then
when that IRQ is EOI'd, do a full KVM_REQ_SCAN_IOAPIC instead of
KVM_REQ_LOAD_EOI_EXITMAP?  Having multiple interrupts in-flight at the time of
scan should be quite rare.

I like the idea, but burning 32 bytes for an edge case of an edge case seems
unnecessary.
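
Completely untested sketch of what I have in mind (the field name is
made up):

	/* in struct kvm_vcpu_arch, instead of the 256-bit bitmap */
	u8 highest_pending_ioapic_vector;

	/* in kvm_ioapic_scan_entry(), zeroed at the top of each scan */
	vcpu->arch.highest_pending_ioapic_vector = 0;

	...

	} else if (kvm_apic_pending_eoi(vcpu, e->fields.vector)) {
		__set_bit(e->fields.vector, ioapic_handled_vectors);
		vcpu->arch.highest_pending_ioapic_vector =
			max_t(u8, vcpu->arch.highest_pending_ioapic_vector,
			      e->fields.vector);
	}

	/* and on EOI, rescan instead of surgically clearing bits */
	if (vector && vector == vcpu->arch.highest_pending_ioapic_vector) {
		vcpu->arch.highest_pending_ioapic_vector = 0;
		kvm_make_request(KVM_REQ_SCAN_IOAPIC, vcpu);
	}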
 
> +
>  	/* Make sure we see any missing RTC EOI */
>  	if (test_bit(vcpu->vcpu_id, dest_map->map))
>  		__set_bit(dest_map->vectors[vcpu->vcpu_id],
> @@ -297,10 +299,15 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, ulong *ioapic_handled_vectors)
>  			u16 dm = kvm_lapic_irq_dest_mode(!!e->fields.dest_mode);
>  
>  			if (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
> -						e->fields.dest_id, dm) ||
> -			    kvm_apic_pending_eoi(vcpu, e->fields.vector))
> +						e->fields.dest_id, dm))
> +				__set_bit(e->fields.vector,
> +					  ioapic_handled_vectors);
> +			else if (kvm_apic_pending_eoi(vcpu, e->fields.vector)) {
>  				__set_bit(e->fields.vector,
>  					  ioapic_handled_vectors);
> +				__set_bit(e->fields.vector,
> +					  vcpu->arch.ioapic_pending_vectors);
> +			}
>  		}
>  	}
>  	spin_unlock(&ioapic->lock);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 0f008f5ef6f0..572e6f9b8602 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5710,6 +5710,16 @@ static int handle_apic_eoi_induced(struct kvm_vcpu *vcpu)
>  
>  	/* EOI-induced VM exit is trap-like and thus no need to adjust IP */
>  	kvm_apic_set_eoi_accelerated(vcpu, vector);
> +
> +	/* When there are instances where ioapic_handled_vectors is
> +	 * set due to pending interrupts, clean up the record and the
> +	 * corresponding bit after the interrupt is completed.
> +	 */
> +	if (test_bit(vector, vcpu->arch.ioapic_pending_vectors)) {

This belongs in common code, probably kvm_ioapic_send_eoi().
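
Untested sketch (keeping your bitmap for illustration), which also
covers the non-accelerated EOI path:

	static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
	{
		struct kvm_vcpu *vcpu = apic->vcpu;

		/* Drop the stale EOI-exit bit once the pending vector is EOI'd. */
		if (test_and_clear_bit(vector, vcpu->arch.ioapic_pending_vectors)) {
			clear_bit(vector, vcpu->arch.ioapic_handled_vectors);
			kvm_make_request(KVM_REQ_LOAD_EOI_EXITMAP, vcpu);
		}

		...
	}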

> +		clear_bit(vector, vcpu->arch.ioapic_pending_vectors);
> +		clear_bit(vector, vcpu->arch.ioapic_handled_vectors);
> +		kvm_make_request(KVM_REQ_LOAD_EOI_EXITMAP, vcpu);
> +	}
>  	return 1;
>  }
>  
> -- 
> 2.43.5
> 
