lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <68340081-0094-4A74-9B33-3431F39659AA@oracle.com>
Date:   Fri, 27 Sep 2019 03:06:02 +0300
From:   Liran Alon <liran.alon@...cle.com>
To:     Sean Christopherson <sean.j.christopherson@...el.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Reto Buerki <reet@...elabs.ch>
Subject: Re: [PATCH 1/2] KVM: nVMX: Always write vmcs02.GUEST_CR3 during
 nested VM-Enter



> On 27 Sep 2019, at 0:43, Sean Christopherson <sean.j.christopherson@...el.com> wrote:
> 
> Write the desired L2 CR3 into vmcs02.GUEST_CR3 during nested VM-Enter
> isntead of deferring the VMWRITE until vmx_set_cr3().  If the VMWRITE
> is deferred, then KVM can consume a stale vmcs02.GUEST_CR3 when it
> refreshes vmcs12->guest_cr3 during nested_vmx_vmexit() if the emulated
> VM-Exit occurs without actually entering L2, e.g. if the nested run
> is squashed because L2 is being put into HLT.

I would rephrase to “If an emulated VMEntry is squashed because L1 sets vmcs12->guest_activity_state to HLT”.
I think it’s a bit more explicit.

> 
> In an ideal world where EPT *requires* unrestricted guest (and vice
> versa), VMX could handle CR3 similar to how it handles RSP and RIP,
> e.g. mark CR3 dirty and conditionally load it at vmx_vcpu_run().  But
> the unrestricted guest silliness complicates the dirty tracking logic
> to the point that explicitly handling vmcs02.GUEST_CR3 during nested
> VM-Enter is a simpler overall implementation.
> 
> Cc: stable@...r.kernel.org
> Reported-by: Reto Buerki <reet@...elabs.ch>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
> ---
> arch/x86/kvm/vmx/nested.c | 8 ++++++++
> arch/x86/kvm/vmx/vmx.c    | 9 ++++++---
> 2 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 41abc62c9a8a..971a24134081 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2418,6 +2418,14 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> 				entry_failure_code))
> 		return -EINVAL;
> 
> +	/*
> +	 * Immediately write vmcs02.GUEST_CR3.  It will be propagated to vmcs12
> +	 * on nested VM-Exit, which can occur without actually running L2, e.g.
> +	 * if L2 is entering HLT state, and thus without hitting vmx_set_cr3().
> +	 */

If I understand correctly, it’s not exactly if L2 is entering HLT state in general.
(E.g. issue doesn’t occur if L2 runs HLT directly which is not configured to be intercepted by vmcs12).
It’s specifically when L1 enters L2 with a HLT guest-activity-state. I suggest rephrasing comment.

> +	if (enable_ept)
> +		vmcs_writel(GUEST_CR3, vmcs12->guest_cr3);
> +
> 	/* Late preparation of GUEST_PDPTRs now that EFER and CRs are set. */
> 	if (load_guest_pdptrs_vmcs12 && nested_cpu_has_ept(vmcs12) &&
> 	    is_pae_paging(vcpu)) {
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d4575ffb3cec..b530950a9c2b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2985,6 +2985,7 @@ void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> {
> 	struct kvm *kvm = vcpu->kvm;
> 	unsigned long guest_cr3;
> +	bool skip_cr3 = false;
> 	u64 eptp;
> 
> 	guest_cr3 = cr3;
> @@ -3000,15 +3001,17 @@ void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 			spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> 		}
> 
> -		if (enable_unrestricted_guest || is_paging(vcpu) ||
> -		    is_guest_mode(vcpu))
> +		if (is_guest_mode(vcpu))
> +			skip_cr3 = true;
> +		else if (enable_unrestricted_guest || is_paging(vcpu))
> 			guest_cr3 = kvm_read_cr3(vcpu);
> 		else
> 			guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
> 		ept_load_pdptrs(vcpu);
> 	}
> 
> -	vmcs_writel(GUEST_CR3, guest_cr3);
> +	if (!skip_cr3)

Nit: It’s a matter of taste, but I prefer positive conditions. i.e. “bool write_guest_cr3”.

Anyway, code seems valid to me. Nice catch.
Reviewed-by: Liran Alon <liran.alon@...cle.com>

-Liran

> +		vmcs_writel(GUEST_CR3, guest_cr3);
> }
> 
> int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> -- 
> 2.22.0
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ