linux-kernel - Re: [PATCH] KVM: X86: fix tlb_flush

Message-ID: <78ad9dff-9a20-c17f-cd8f-931090834133@redhat.com>
Date:   Thu, 27 May 2021 14:55:56 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Lai Jiangshan <jiangshanlai@...il.com>,
        linux-kernel@...r.kernel.org
Cc:     Lai Jiangshan <laijs@...ux.alibaba.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        kvm@...r.kernel.org
Subject: Re: [PATCH] KVM: X86: fix tlb_flush_guest()

On 27/05/21 04:39, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@...ux.alibaba.com>
> 
> For KVM_VCPU_FLUSH_TLB used in kvm_flush_tlb_multi(), the guest expects
> the hypervisor to perform an operation equivalent to
> native_flush_tlb_global() or invpcid_flush_all() on the specified guest CPU.
> 
> When TDP is enabled, it is sufficient to flush the hardware TLB of the
> specified guest CPU.
> 
> But when using shadow paging, the hypervisor has to sync the shadow
> page tables before flushing the hardware TLB so that it truly emulates
> invpcid_flush_all() in the guest.

Can you explain why?

Also, it is simpler to handle this in kvm_vcpu_flush_tlb_guest(), using
"if (tdp_enabled)".  This also provides a single, good place to add a
comment explaining what invalid entries KVM_REQ_MMU_RELOAD is presenting.
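
A minimal userspace sketch of that suggestion, assuming the check lives in the common helper rather than in each vendor hook. Only tdp_enabled, kvm_vcpu_flush_tlb_guest and KVM_REQ_MMU_RELOAD come from this thread; struct vcpu_model, the MODEL_* names and the model_* functions are hypothetical stand-ins, not the real KVM definitions, and this is not the code that was eventually merged.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the request bit; not the kernel's definition. */
enum model_request {
	MODEL_REQ_MMU_RELOAD = 1u << 0,
};

/* Hypothetical, heavily reduced model of per-vCPU state. */
struct vcpu_model {
	unsigned int pending_requests; /* bitmask of model_request values */
	unsigned int hw_tlb_flushes;   /* count of vendor flush callbacks */
};

/* Mirrors the flag named in the mail: true with EPT/NPT, false with shadow paging. */
bool tdp_enabled;

/* Stand-in for the vendor hook behind .tlb_flush_guest. */
void model_vendor_tlb_flush_guest(struct vcpu_model *vcpu)
{
	vcpu->hw_tlb_flushes++;
}

/*
 * Model of the common helper with the check in one place.
 *
 * With TDP, flushing the hardware TLB for this vCPU is enough.
 * With shadow paging, the shadow page tables must be synced first, so
 * the model only records a reload request (in the patch this is
 * kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu)) and skips the vendor hook.
 */
void model_kvm_vcpu_flush_tlb_guest(struct vcpu_model *vcpu)
{
	if (tdp_enabled) {
		model_vendor_tlb_flush_guest(vcpu);
		return;
	}

	vcpu->pending_requests |= MODEL_REQ_MMU_RELOAD;
}
```

With the check in the common helper, the SVM and VMX hooks could stay as they are, and the explanatory comment would only need to be written once.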

Paolo

> The problem has existed since the first implementation of KVM_VCPU_FLUSH_TLB
> in commit f38a7b75267f ("KVM: X86: support paravirtualized help for TLB
> shootdowns").  But I don't think it was a real-world problem at that
> time, since the guest flushed the local CPU's TLB first, before queuing
> KVM_VCPU_FLUSH_TLB to other CPUs.  This means that the hypervisor synced the
> shadow page tables before seeing the corresponding KVM_VCPU_FLUSH_TLB requests.
> 
> After commit 4ce94eabac16 ("x86/mm/tlb: Flush remote and local TLBs
> concurrently"), the guest no longer flushes the local CPU's TLB first, so
> the hypervisor can handle another vCPU's KVM_VCPU_FLUSH_TLB before the
> local vCPU's TLB flush and might flush the hardware TLB without syncing
> the shadow page tables beforehand.
> 
> Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns")
> Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
> ---
>   arch/x86/kvm/svm/svm.c | 16 +++++++++++++++-
>   arch/x86/kvm/vmx/vmx.c |  8 +++++++-
>   2 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 05eca131eaf2..f4523c859245 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3575,6 +3575,20 @@ void svm_flush_tlb(struct kvm_vcpu *vcpu)
>   		svm->current_vmcb->asid_generation--;
>   }
>   
> +static void svm_flush_tlb_guest(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * When NPT is enabled, just flush the ASID.
> +	 *
> +	 * When NPT is not enabled, the operation should be equal to
> +	 * native_flush_tlb_global(), invpcid_flush_all() in guest.
> +	 */
> +	if (npt_enabled)
> +		svm_flush_tlb(vcpu);
> +	else
> +		kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
> +}
> +
>   static void svm_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t gva)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
> @@ -4486,7 +4500,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>   	.tlb_flush_all = svm_flush_tlb,
>   	.tlb_flush_current = svm_flush_tlb,
>   	.tlb_flush_gva = svm_flush_tlb_gva,
> -	.tlb_flush_guest = svm_flush_tlb,
> +	.tlb_flush_guest = svm_flush_tlb_guest,
>   
>   	.run = svm_vcpu_run,
>   	.handle_exit = handle_exit,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4bceb5ca3a89..1913504e3472 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3049,8 +3049,14 @@ static void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu)
>   	 * are required to flush GVA->{G,H}PA mappings from the TLB if vpid is
>   	 * disabled (VM-Enter with vpid enabled and vpid==0 is disallowed),
>   	 * i.e. no explicit INVVPID is necessary.
> +	 *
> +	 * When EPT is not enabled, the operation should be equal to
> +	 * native_flush_tlb_global(), invpcid_flush_all() in guest.
>   	 */
> -	vpid_sync_context(to_vmx(vcpu)->vpid);
> +	if (enable_ept)
> +		vpid_sync_context(to_vmx(vcpu)->vpid);
> +	else
> +		kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
>   }
>   
>   void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu)
>