linux-kernel - Re: [PATCH v2 07/11] KVM: x86: add a delayed hardware NMI injection interface

Message-ID: <d3681058-224d-07c7-283f-5f81ab523844@amd.com>
Date:   Wed, 8 Feb 2023 15:21:59 +0530
From:   Santosh Shukla <santosh.shukla@....com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Maxim Levitsky <mlevitsk@...hat.com>
Cc:     kvm@...r.kernel.org, Sandipan Das <sandipan.das@....com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Borislav Petkov <bp@...en8.de>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Daniel Sneddon <daniel.sneddon@...ux.intel.com>,
        Jiaxi Chen <jiaxi.chen@...ux.intel.com>,
        Babu Moger <babu.moger@....com>, linux-kernel@...r.kernel.org,
        Jing Liu <jing2.liu@...el.com>,
        Wyes Karny <wyes.karny@....com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2 07/11] KVM: x86: add a delayed hardware NMI injection
 interface

On 2/1/2023 5:36 AM, Sean Christopherson wrote:
> On Tue, Jan 31, 2023, Sean Christopherson wrote:
>> On Tue, Nov 29, 2022, Maxim Levitsky wrote:
>>> @@ -10015,13 +10022,34 @@ static void process_nmi(struct kvm_vcpu *vcpu)
>>>  	 * Otherwise, allow two (and we'll inject the first one immediately).
>>>  	 */
>>>  	if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected)
>>> -		limit = 1;
>>> +		limit--;
>>> +
>>> +	/* Also if there is already a NMI hardware queued to be injected,
>>> +	 * decrease the limit again
>>> +	 */
>>> +	if (static_call(kvm_x86_get_hw_nmi_pending)(vcpu))
>>> +		limit--;
>>
>> I don't think this is correct.  If a vNMI is pending and NMIs are blocked, then
>> limit will end up '0' and KVM will fail to pend the additional NMI in software.
> 
> Scratch that, dropping the second NMI in this case is correct.  The "running" part
> of the existing "x86 is limited to one NMI running, and one NMI pending after it"
> confused me.  The "running" thing is really just a variant on NMIs being blocked.
> 
> I'd also like to avoid the double decrement logic.  Accounting the virtual NMI is
> a very different thing than dealing with concurrent NMIs, I'd prefer to reflect
> that in the code.
> 
> Any objection to folding in the below to end up with:
> 
> 	unsigned limit;
> 
> 	/*
> 	 * x86 is limited to one NMI pending, but because KVM can't react to
> 	 * incoming NMIs as quickly as bare metal, e.g. if the vCPU is
> 	 * scheduled out, KVM needs to play nice with two queued NMIs showing
> 	 * up at the same time.  To handle this scenario, allow two NMIs to be
> 	 * (temporarily) pending so long as NMIs are not blocked and KVM is not
> 	 * waiting for a previous NMI injection to complete (which effectively
> 	 * blocks NMIs).  KVM will immediately inject one of the two NMIs, and
> 	 * will request an NMI window to handle the second NMI.
> 	 */
> 	if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected)
> 		limit = 1;
> 	else
> 		limit = 2;
> 
> 	/*
> 	 * Adjust the limit to account for pending virtual NMIs, which aren't
> 	 * tracked in vcpu->arch.nmi_pending.
> 	 */
> 	if (static_call(kvm_x86_is_vnmi_pending)(vcpu))
> 		limit--;
> 
> 	vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0);
> 	vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit);
> 

I believe you missed the hunk below from this function:

	if (vcpu->arch.nmi_pending &&
	    static_call(kvm_x86_set_vnmi_pending)(vcpu))
		vcpu->arch.nmi_pending--;

Or am I missing something? Please advise.
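
To make the question concrete, here is a rough userspace model of how I read
the combined flow with that hunk included. It is only a sketch, not kernel
code: struct nmi_model, its fields, model_set_vnmi_pending() and
model_process_nmi() are hypothetical stand-ins for the vCPU state and the
static calls quoted above, and the assumption that setting a vNMI pending
fails when one is already pending (or when vNMI is unavailable) is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU NMI state; not a KVM type. */
struct nmi_model {
	bool nmi_masked;	/* models kvm_x86_get_nmi_mask() */
	bool nmi_injected;	/* models vcpu->arch.nmi_injected */
	bool has_vnmi;		/* assumption: hardware vNMI is available */
	bool vnmi_pending;	/* models kvm_x86_is_vnmi_pending() */
	unsigned nmi_queued;	/* models vcpu->arch.nmi_queued */
	unsigned nmi_pending;	/* models vcpu->arch.nmi_pending */
};

/*
 * Models kvm_x86_set_vnmi_pending(). Assumption: it succeeds only when
 * vNMI is available and no virtual NMI is pending yet.
 */
static bool model_set_vnmi_pending(struct nmi_model *m)
{
	if (!m->has_vnmi || m->vnmi_pending)
		return false;
	m->vnmi_pending = true;
	return true;
}

/* Returns true where the real code would request KVM_REQ_EVENT. */
static bool model_process_nmi(struct nmi_model *m)
{
	unsigned limit;

	assert(m);

	/* One software-pending NMI if NMIs are blocked, else two. */
	if (m->nmi_masked || m->nmi_injected)
		limit = 1;
	else
		limit = 2;

	/* A pending virtual NMI is not tracked in nmi_pending. */
	if (m->vnmi_pending)
		limit--;

	/* Plain-integer equivalent of the atomic_xchg() + min() pair. */
	m->nmi_pending += m->nmi_queued;
	m->nmi_queued = 0;
	if (m->nmi_pending > limit)
		m->nmi_pending = limit;

	/* The hunk in question: hand one pending NMI to hardware. */
	if (m->nmi_pending && model_set_vnmi_pending(m))
		m->nmi_pending--;

	return m->nmi_pending != 0;
}
```

In this model, two NMIs queued on an idle vNMI-capable vCPU end up as one
virtual NMI pending in hardware plus one in nmi_pending, while an NMI queued
with NMIs blocked and a vNMI already pending is dropped, which matches the
behavior described as correct above.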

> 	if (vcpu->arch.nmi_pending)
> 		kvm_make_request(KVM_REQ_EVENT, vcpu);
> 
> --
> From: Sean Christopherson <seanjc@...gle.com>
> Date: Tue, 31 Jan 2023 16:02:21 -0800
> Subject: [PATCH] KVM: x86: Tweak the code and comment related to handling
>  concurrent NMIs
> 
> Tweak the code and comment that deals with concurrent NMIs to explicitly
> call out that x86 allows exactly one pending NMI, but that KVM needs to
> temporarily allow two pending NMIs in order to workaround the fact that
> the target vCPU cannot immediately recognize an incoming NMI, unlike bare
> metal.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/x86.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 030136b6ebbd..fda09ba48b6b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10122,15 +10122,22 @@ static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu,
>  
>  static void process_nmi(struct kvm_vcpu *vcpu)
>  {
> -	unsigned limit = 2;
> +	unsigned limit;
>  
>  	/*
> -	 * x86 is limited to one NMI running, and one NMI pending after it.
> -	 * If an NMI is already in progress, limit further NMIs to just one.
> -	 * Otherwise, allow two (and we'll inject the first one immediately).
> +	 * x86 is limited to one NMI pending, but because KVM can't react to
> +	 * incoming NMIs as quickly as bare metal, e.g. if the vCPU is
> +	 * scheduled out, KVM needs to play nice with two queued NMIs showing
> +	 * up at the same time.  To handle this scenario, allow two NMIs to be
> +	 * (temporarily) pending so long as NMIs are not blocked and KVM is not
> +	 * waiting for a previous NMI injection to complete (which effectively
> +	 * blocks NMIs).  KVM will immediately inject one of the two NMIs, and
> +	 * will request an NMI window to handle the second NMI.
>  	 */
>  	if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected)
>  		limit = 1;
> +	else
> +		limit = 2;
>  
>  	vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0);
>  	vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit);
> 

Looks good to me, will include in v3.

Thanks,
Santosh