Message-ID: <4a02522c-50a1-0aa2-879c-98ba7631ffbe@gmail.com>
Date:   Fri, 12 May 2017 15:38:03 +0800
From:   Xiao Guangrong <guangrong.xiao@...il.com>
To:     Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     Peter Feiner <pfeiner@...gle.com>,
        David Matlack <dmatlack@...gle.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Xiao Guangrong <xiaoguangrong@...cent.com>,
        Wanpeng Li <wanpeng.li@...mail.com>, kevin.tian@...el.com
Subject: Re: [PATCH 2/2] KVM: nVMX: fix nEPT handling of guest page table
 accesses


CC'ing Kevin, as I am not sure whether Intel is aware of this issue. It
breaks other hypervisors as well, e.g., Xen.

On 05/11/2017 07:23 PM, Paolo Bonzini wrote:
> The new ept_access_test_paddr_read_only_ad_disabled testcase
> caused an infinite stream of EPT violations because KVM did not
> find anything bad in the page tables and kept re-executing the
> faulting instruction.
> 
> This is because the exit qualification said we were reading from
> the page tables, but actually the cause of the EPT violation
> was writing the A/D bits.  This happened even with eptad=0, quite
> surprisingly.
> 
> Thus, always treat guest page table accesses as read+write operations,
> even if the exit qualification says otherwise.  This fixes the
> testcase.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
>   arch/x86/kvm/vmx.c | 36 +++++++++++++++++++++++-------------
>   1 file changed, 23 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c6f4ad44aa95..c868cbdad29a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6209,17 +6209,19 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>   	u32 error_code;
>   
>   	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
> +	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
> +	trace_kvm_page_fault(gpa, exit_qualification);
>   
> -	if (is_guest_mode(vcpu)
> -	    && !(exit_qualification & EPT_VIOLATION_GVA_TRANSLATED)) {
> -		/*
> -		 * Fix up exit_qualification according to whether guest
> -		 * page table accesses are reads or writes.
> -		 */
> -		u64 eptp = nested_ept_get_cr3(vcpu);
> -		if (!(eptp & VMX_EPT_AD_ENABLE_BIT))
> -			exit_qualification &= ~EPT_VIOLATION_ACC_WRITE;
> -	}
> +	/*
> +	 * All guest page table accesses are potential writes to A/D bits,
> +	 * but EPT microcode only reports them as such when EPT A/D is
> +	 * enabled.  Tracing ept_access_test_paddr_read_only_ad_disabled (from
> +	 * kvm-unit-tests) with eptad=0 and eptad=1 shows that the processor
> +	 * does not change its behavior when EPTP enables A/D bits; the only
> +	 * difference is in the exit qualification.  So fix this up here.
> +	 */
> +	if (!(exit_qualification & EPT_VIOLATION_GVA_TRANSLATED))
> +		exit_qualification |= EPT_VIOLATION_ACC_WRITE;
>   
>   	/*
>   	 * EPT violation happened while executing iret from NMI,
> @@ -6231,9 +6233,6 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>   			(exit_qualification & INTR_INFO_UNBLOCK_NMI))
>   		vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO, GUEST_INTR_STATE_NMI);
>   
> -	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
> -	trace_kvm_page_fault(gpa, exit_qualification);
> -
>   	/* Is it a read fault? */
>   	error_code = (exit_qualification & EPT_VIOLATION_ACC_READ)
>   		     ? PFERR_USER_MASK : 0;
> @@ -6250,6 +6249,17 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>   		      ? PFERR_PRESENT_MASK : 0;
>   
>   	vcpu->arch.gpa_available = true;
> +
> +	if (is_guest_mode(vcpu)
> +	    && !(exit_qualification & EPT_VIOLATION_GVA_TRANSLATED)) {
> +		/*
> +		 * Now fix up exit_qualification according to what the
> +		 * L1 hypervisor expects to see.
> +		 */
> +		u64 eptp = nested_ept_get_cr3(vcpu);
> +		if (!(eptp & VMX_EPT_AD_ENABLE_BIT))
> +			exit_qualification &= ~EPT_VIOLATION_ACC_WRITE;
> +	}

I am not sure this is really needed. It (PFEC.W = 0 when A/D bits need to be
set on the page structures) is not what we expect.

Maybe it is better to always report the right behavior? Especially since Intel
may fix its microcode, as this hurts the newest CPUs as well.
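
To make the two options concrete, here is a minimal standalone sketch of the
fix-ups discussed above. It is not kernel code: the function names
(fixup_for_l0, fixup_for_l1, report_as_is) are hypothetical, and the bit
positions are assumed to mirror the definitions in the kernel's asm/vmx.h.
The first two helpers follow the quoted patch; the third is only my reading
of the "always report the right behavior" alternative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match asm/vmx.h; redefined here only for illustration. */
#define EPT_VIOLATION_ACC_READ        (1ull << 0)
#define EPT_VIOLATION_ACC_WRITE       (1ull << 1)
#define EPT_VIOLATION_GVA_TRANSLATED  (1ull << 8)
#define VMX_EPT_AD_ENABLE_BIT         (1ull << 6)

/*
 * Step 1 of the patch: if the fault hit a guest page table access
 * (GVA_TRANSLATED clear), treat it as a write, because the walk may
 * need to set A/D bits.
 */
uint64_t fixup_for_l0(uint64_t exit_qual)
{
	if (!(exit_qual & EPT_VIOLATION_GVA_TRANSLATED))
		exit_qual |= EPT_VIOLATION_ACC_WRITE;
	return exit_qual;
}

/*
 * Step 2 of the patch: before reflecting the exit to L1, clear the
 * write bit again when L1's EPTP has A/D disabled, so L1 sees what
 * the hardware would have reported.
 */
uint64_t fixup_for_l1(uint64_t exit_qual, uint64_t l1_eptp, bool guest_mode)
{
	if (guest_mode &&
	    !(exit_qual & EPT_VIOLATION_GVA_TRANSLATED) &&
	    !(l1_eptp & VMX_EPT_AD_ENABLE_BIT))
		exit_qual &= ~EPT_VIOLATION_ACC_WRITE;
	return exit_qual;
}

/* Alternative: skip step 2 and hand L1 the corrected qualification. */
uint64_t report_as_is(uint64_t exit_qual, uint64_t l1_eptp, bool guest_mode)
{
	(void)l1_eptp;
	(void)guest_mode;
	return exit_qual;
}

/* Small self-check of the behavior described in the comments above. */
void exit_qual_self_check(void)
{
	uint64_t q = fixup_for_l0(EPT_VIOLATION_ACC_READ);

	assert(q & EPT_VIOLATION_ACC_WRITE);
	assert(!(fixup_for_l1(q, 0, true) & EPT_VIOLATION_ACC_WRITE));
	assert(report_as_is(q, 0, true) & EPT_VIOLATION_ACC_WRITE);
}
```

With L1's A/D disabled, fixup_for_l1() hides the write bit again, while
report_as_is() keeps it, which is the difference I am asking about.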

Thanks!
