lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b93133a9a868e21fe4fb854f9ed0fdf20519064.camel@redhat.com>
Date:   Wed, 06 Jul 2022 15:05:06 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Oliver Upton <oupton@...gle.com>,
        Peter Shier <pshier@...gle.com>
Subject: Re: [PATCH v2 15/21] KVM: x86: Hoist nested event checks above
 event injection logic

On Tue, 2022-06-14 at 20:47 +0000, Sean Christopherson wrote:
> Perform nested event checks before re-injecting exceptions/events into
> L2.  If a pending exception causes VM-Exit to L1, re-injecting events
> into vmcs02 is premature and wasted effort.  Take care to ensure events
> that need to be re-injected are still re-injected if checking for nested
> events "fails", i.e. if KVM needs to force an immediate entry+exit to
> complete the to-be-re-injecteed event.
> 
> Keep the "can_inject" logic the same for now; it too can be pushed below
> the nested checks, but is a slightly riskier change (see past bugs about
> events not being properly purged on nested VM-Exit).
> 
> Add and/or modify comments to better document the various interactions.
> Of note is the comment regarding "blocking" previously injected NMIs and
> IRQs if an exception is pending.  The old comment isn't wrong strictly
> speaking, but it failed to capture the reason why the logic even exists.
> 
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/x86.c | 89 +++++++++++++++++++++++++++-------------------
>  1 file changed, 53 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e45465075005..930de833aa2b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9502,53 +9502,70 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
>  
>  static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
>  {
> +	bool can_inject = !kvm_event_needs_reinjection(vcpu);
>  	int r;
> -	bool can_inject = true;
>  
> -	/* try to reinject previous events if any */
> +	/*
> +	 * Process nested events first, as nested VM-Exit supercedes event
> +	 * re-injection.  If there's an event queued for re-injection, it will
> +	 * be saved into the appropriate vmc{b,s}12 fields on nested VM-Exit.
> +	 */
> +	if (is_guest_mode(vcpu))
> +		r = kvm_check_nested_events(vcpu);
> +	else
> +		r = 0;

Makes sense a lot!

>  
> -	if (vcpu->arch.exception.injected) {
> +	/*
> +	 * Re-inject exceptions and events *especially* if immediate entry+exit
> +	 * to/from L2 is needed, as any event that has already been injected
> +	 * into L2 needs to complete its lifecycle before injecting a new event.
> +	 *
> +	 * Don't re-inject an NMI or interrupt if there is a pending exception.
> +	 * This collision arises if an exception occurred while vectoring the
> +	 * injected event, KVM intercepted said exception, and KVM ultimately
> +	 * determined the fault belongs to the guest and queues the exception
> +	 * for injection back into the guest.
> +	 *
> +	 * "Injected" interrupts can also collide with pending exceptions if
> +	 * userspace ignores the "ready for injection" flag and blindly queues
> +	 * an interrupt.  In that case, prioritizing the exception is correct,
> +	 * as the exception "occurred" before the exit to userspace.  Trap-like
> +	 * exceptions, e.g. most #DBs, have higher priority than interrupts.
> +	 * And while fault-like exceptions, e.g. #GP and #PF, are the lowest
> +	 * priority, they're only generated (pended) during instruction
> +	 * execution, and interrupts are recognized at instruction boundaries.
> +	 * Thus a pending fault-like exception means the fault occurred on the
> +	 * *previous* instruction and must be serviced prior to recognizing any
> +	 * new events in order to fully complete the previous instruction.
> +	 */
> +	if (vcpu->arch.exception.injected)
>  		kvm_inject_exception(vcpu);
> -		can_inject = false;
> -	}
> +	else if (vcpu->arch.exception.pending)
> +		; /* see above */
> +	else if (vcpu->arch.nmi_injected)
> +		static_call(kvm_x86_inject_nmi)(vcpu);
> +	else if (vcpu->arch.interrupt.injected)
> +		static_call(kvm_x86_inject_irq)(vcpu, true);
> +
>  	/*
> -	 * Do not inject an NMI or interrupt if there is a pending
> -	 * exception.  Exceptions and interrupts are recognized at
> -	 * instruction boundaries, i.e. the start of an instruction.
> -	 * Trap-like exceptions, e.g. #DB, have higher priority than
> -	 * NMIs and interrupts, i.e. traps are recognized before an
> -	 * NMI/interrupt that's pending on the same instruction.
> -	 * Fault-like exceptions, e.g. #GP and #PF, are the lowest
> -	 * priority, but are only generated (pended) during instruction
> -	 * execution, i.e. a pending fault-like exception means the
> -	 * fault occurred on the *previous* instruction and must be
> -	 * serviced prior to recognizing any new events in order to
> -	 * fully complete the previous instruction.
> +	 * Exceptions that morph to VM-Exits are handled above, and pending
> +	 * exceptions on top of injected exceptions that do not VM-Exit should
> +	 * either morph to #DF or, sadly, override the injected exception.
>  	 */
> -	else if (!vcpu->arch.exception.pending) {
> -		if (vcpu->arch.nmi_injected) {
> -			static_call(kvm_x86_inject_nmi)(vcpu);
> -			can_inject = false;
> -		} else if (vcpu->arch.interrupt.injected) {
> -			static_call(kvm_x86_inject_irq)(vcpu, true);
> -			can_inject = false;
> -		}
> -	}
> -
>  	WARN_ON_ONCE(vcpu->arch.exception.injected &&
>  		     vcpu->arch.exception.pending);
>  
>  	/*
> -	 * Call check_nested_events() even if we reinjected a previous event
> -	 * in order for caller to determine if it should require immediate-exit
> -	 * from L2 to L1 due to pending L1 events which require exit
> -	 * from L2 to L1.
> +	 * Bail if immediate entry+exit to/from the guest is needed to complete
> +	 * nested VM-Enter or event re-injection so that a different pending
> +	 * event can be serviced (or if KVM needs to exit to userspace).
> +	 *
> +	 * Otherwise, continue processing events even if VM-Exit occurred.  The
> +	 * VM-Exit will have cleared exceptions that were meant for L2, but
> +	 * there may now be events that can be injected into L1.
>  	 */
> -	if (is_guest_mode(vcpu)) {
> -		r = kvm_check_nested_events(vcpu);
> -		if (r < 0)
> -			goto out;
> -	}
> +	if (r < 0)
> +		goto out;
>  
>  	/* try to inject new event if pending */
>  	if (vcpu->arch.exception.pending) {

All makes sense AFAIK.

Reviewed-by: Maxim Levitsky <mlevitsk@...hat.com>

Best regards,
	Maxim Levitsky



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ