Message-ID: <e348e75dac85efce9186b6b10a6da1c6532a3378.camel@redhat.com>
Date:   Tue, 10 Oct 2023 15:03:11 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Santosh Shukla <santosh.shukla@....com>
Subject: Re: [PATCH] KVM: SVM: Don't intercept IRET when injecting NMI and
 vNMI is enabled

On Mon, 2023-10-09 at 14:29 -0700, Sean Christopherson wrote:
> When vNMI is enabled, rely entirely on hardware to correctly handle NMI
> blocking, i.e. don't intercept IRET to detect when NMIs are no longer
> blocked.  KVM already correctly ignores svm->nmi_masked when vNMI is
> enabled, so the effect of the bug is essentially an unnecessary VM-Exit.

I would rephrase this as follows:

KVM intercepts IRET for two reasons:
- To track NMI masking, so that KVM knows at any point in time whether NMIs are masked.
- To track the NMI window (so that another NMI can be injected once IRET finishes executing).

When L1 uses vNMI, both needs are met by the vNMI hardware:
- The NMI masking state resides in the V_NMI_BLOCKING bit of int_ctl and can be read by KVM
  at will.
- The vNMI hardware injects pending NMIs automatically as soon as NMIs become unblocked.

Thus there is no need to intercept IRET while vNMI is active.
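
For illustration, the read side can look roughly like the sketch below (shaped
after KVM's svm_get_nmi_mask(), not quoted verbatim from upstream): with vNMI
the masking state is simply read back from int_ctl instead of from the
software-tracked svm->nmi_masked.

static bool svm_get_nmi_mask(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);

	/* Hardware tracks NMI blocking in int_ctl when vNMI is enabled. */
	if (is_vnmi_enabled(svm))
		return !!(svm->vmcb->control.int_ctl & V_NMI_BLOCKING_MASK);

	/* Otherwise fall back to the software bookkeeping. */
	return svm->nmi_masked;
}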

However, even when vNMI is active in L1, svm_inject_nmi() can still be called
to perform a direct NMI injection, to support the case where KVM is trying to
inject two NMIs simultaneously.

In this case there is no need to enable IRET interception.
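
For context, the pending side looks roughly like this (a sketch along the
lines of svm_set_vnmi_pending(); not quoted verbatim from upstream): if
V_NMI_PENDING is already set, the helper returns false and the extra NMI has
to be injected directly via svm_inject_nmi().

static bool svm_set_vnmi_pending(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);

	if (!is_vnmi_enabled(svm))
		return false;

	/* Only one NMI can be pending in hardware at a time. */
	if (svm->vmcb->control.int_ctl & V_NMI_PENDING_MASK)
		return false;

	svm->vmcb->control.int_ctl |= V_NMI_PENDING_MASK;
	vmcb_mark_dirty(svm->vmcb, VMCB_INTR);

	/*
	 * The pending NMI is serviced by hardware, so KVM can't know when it
	 * is actually injected; handing it off to hardware counts as
	 * injection.
	 */
	++vcpu->stat.nmi_injections;

	return true;
}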

Note that the effect of this bug is essentially an unnecessary VM-Exit.

Also note that even when vNMI is supported and used, running a nested guest
disables vNMI for the L1 guest, so IRET will still be intercepted.
In that case, if the nested VM exit happens before the NMI is delivered,
an unnecessary VM exit can still occur, but this is even less likely.

> 
> Note, per the APM, hardware sets the BLOCKING flag when software directly
> injects an NMI:
> 
>   If Event Injection is used to inject an NMI when NMI Virtualization is
>   enabled, VMRUN sets V_NMI_MASK in the guest state.

I think this note is not needed in the commit message. It describes a
separate, unrelated concern and can live somewhere in the code, but not in
the commit message.

> 
> Fixes: fa4c027a7956 ("KVM: x86: Add support for SVM's Virtual NMI")
> Link: https://lore.kernel.org/all/ZOdnuDZUd4mevCqe@google.com
> Cc: Santosh Shukla <santosh.shukla@....com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> 
> Santosh, can you verify that I didn't break vNMI?  I don't have access to the
> right hardware.  Thanks!
> 
>  arch/x86/kvm/svm/svm.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b7472ad183b9..4f22d12b5d60 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3569,8 +3569,15 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>  	if (svm->nmi_l1_to_l2)
>  		return;
>  
> -	svm->nmi_masked = true;
> -	svm_set_iret_intercept(svm);
> +	/*
> +	 * No need to manually track NMI masking when vNMI is enabled, hardware
> +	 * automatically sets V_NMI_BLOCKING_MASK as appropriate, including the
> +	 * case where software directly injects an NMI.
> +	 */
> +	if (!is_vnmi_enabled(svm)) {
> +		svm->nmi_masked = true;
> +		svm_set_iret_intercept(svm);
> +	}
>  	++vcpu->stat.nmi_injections;
>  }
>  
> 
> base-commit: 86701e115030e020a052216baa942e8547e0b487


Note that while running nested, is_vnmi_enabled() will return false because
L1's vNMI is indeed disabled (I wonder if is_vnmi_enabled() should be renamed
to is_l1_vnmi_enabled() to clarify this).

So when a nested VM exit happens, the IRET intercept can still remain enabled,
which should not cause an issue but is still something to keep in mind.
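
To make the nested behavior concrete, the helpers look roughly like this (a
sketch shaped after the existing code in svm.h; details may differ from
upstream): the L1 vmcb is only consulted when not in guest mode, so
is_vnmi_enabled() reports false while L2 runs.

static inline struct vmcb *get_vnmi_vmcb_l1(struct vcpu_svm *svm)
{
	if (!vnmi)
		return NULL;

	/* L1's vNMI is disabled while L2 is running. */
	if (is_guest_mode(&svm->vcpu))
		return NULL;

	return svm->vmcb01.ptr;
}

static inline bool is_vnmi_enabled(struct vcpu_svm *svm)
{
	struct vmcb *vmcb = get_vnmi_vmcb_l1(svm);

	return vmcb && (vmcb->control.int_ctl & V_NMI_ENABLE_MASK);
}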

Best regards,
	Maxim Levitsky
