Date:   Fri, 27 May 2022 17:55:08 +0800
From:   "Wang, Lei" <lei4.wang@...el.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     pbonzini@...hat.com, vkuznets@...hat.com, wanpengli@...cent.com,
        jmattson@...gle.com, joro@...tes.org, chenyi.qiang@...el.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 8/8] KVM: VMX: Enable PKS for nested VM

On 5/20/2022 9:24 AM, Sean Christopherson wrote:
> Nit, use "KVM: nVMX:" for the shortlog scope.

Will change it.

> On Sun, Apr 24, 2022, Lei Wang wrote:
>> @@ -2433,6 +2437,10 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
>>   		if (kvm_mpx_supported() && vmx->nested.nested_run_pending &&
>>   		    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
>>   			vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
>> +
>> +		if (vmx->nested.nested_run_pending &&
>> +		    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS))
>> +			vmcs_write64(GUEST_IA32_PKRS, vmcs12->guest_ia32_pkrs);
> As mentioned in the BNDCFGS thread, this does the wrong thing for SMM.  But, after
> a lot of thought, handling this in nested_vmx_enter_non_root_mode() would be little
> more than a band-aid, and a messy one at that, because KVM's SMM emulation is
> horrifically broken with respect to nVMX.
>
> Entry to SMM does not modify _any_ state that is not saved in SMRAM.  That
> we're having to deal with this crap is a symptom of KVM doing the complete wrong
> thing by piggybacking nested_vmx_vmexit() and nested_vmx_enter_non_root_mode().
>
> The SDM's description of CET spells this out very, very clearly:
>
>    On processors that support CET shadow stacks, when the processor enters SMM,
>    the processor saves the SSP register to the SMRAM state save area (see Table 31-3)
>    and clears CR4.CET to 0. Thus, the initial execution environment of the SMI handler
>    has CET disabled and all of the CET state of the interrupted program is still in the
>    machine. An SMM that uses CET is required to save the interrupted program’s CET
>    state and restore the CET state prior to exiting SMM.
>
> It mostly works because no guest SMM handler does anything with most of the MSRs,
> but it's all wildly wrong.  A concrete example of a lurking bug is if vmcs12 uses
> the VM-Exit MSR load list, in which case the forced nested_vmx_vmexit() will load
> state that is never undone.
>
> So, my very strong vote is to ignore SMM and let someone who actually cares about
> SMM fix that mess properly by adding custom flows for exiting/re-entering L2 on
> SMI/RSM.

OK, I will leave the mess alone.

>>   	}
>>   
>>   	if (nested_cpu_has_xsaves(vmcs12))
>> @@ -2521,6 +2529,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>>   	if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
>>   	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
>>   		vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs);
>> +	if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
> ERROR: trailing whitespace
> #85: FILE: arch/x86/kvm/vmx/nested.c:3407:
> +^Iif (kvm_cpu_cap_has(X86_FEATURE_PKS) && $

Sorry for my carelessness, will remove the trailing whitespace.

>> +	    (!vmx->nested.nested_run_pending ||
>> +	     !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS)))
>> +		vmcs_write64(GUEST_IA32_PKRS, vmx->nested.vmcs01_guest_pkrs);
>> +
>>   	vmx_set_rflags(vcpu, vmcs12->guest_rflags);
>>   
>>   	/* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
>> @@ -2897,6 +2910,10 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
>>   					   vmcs12->host_ia32_perf_global_ctrl)))
>>   		return -EINVAL;
>>   
>> +	if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PKRS) &&
>> +	    CC(!kvm_pkrs_valid(vmcs12->host_ia32_pkrs)))
>> +		return -EINVAL;
>> +
>>   #ifdef CONFIG_X86_64
>>   	ia32e = !!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE);
>>   #else
>> @@ -3049,6 +3066,10 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
>>   	if (nested_check_guest_non_reg_state(vmcs12))
>>   		return -EINVAL;
>>   
>> +	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS) &&
>> +	    CC(!kvm_pkrs_valid(vmcs12->guest_ia32_pkrs)))
>> +		return -EINVAL;
>> +
>>   	return 0;
>>   }
>>   
>> @@ -3384,6 +3405,10 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>>   	    (!from_vmentry ||
>>   	     !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
>>   		vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
>> +	if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
>> +	    (!from_vmentry ||
> This should be "!vmx->nested.nested_run_pending" instead of "!from_vmentry" to
> avoid the unnecessary VMREAD when restoring L2 with a pending VM-Enter.

Will fix that.
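
For reference, the revised check can be modelled outside the kernel roughly
as below. This is only a standalone sketch, not the actual patch: the helper
name and the SKETCH_ macro are hypothetical, the bit value is illustrative,
and plain booleans stand in for kvm_cpu_cap_has(X86_FEATURE_PKS) and
vmx->nested.nested_run_pending. It uses the same condition under which
prepare_vmcs02() writes vmcs01_guest_pkrs back, so the VMREAD is skipped
whenever the snapshot would not be consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; the real definition lives in the kernel headers. */
#define SKETCH_VM_ENTRY_LOAD_IA32_PKRS (1u << 22)

/*
 * Hypothetical helper: returns true when vmcs01's GUEST_IA32_PKRS has to be
 * snapshotted before switching to vmcs02.  The snapshot is only consumed
 * when L2 does not load PKRS from vmcs12, i.e. when no VM-Enter is pending
 * or the "load PKRS" entry control is clear.
 */
static bool sketch_need_vmcs01_pkrs_snapshot(bool pks_supported,
					     bool nested_run_pending,
					     uint32_t vm_entry_controls)
{
	return pks_supported &&
	       (!nested_run_pending ||
		!(vm_entry_controls & SKETCH_VM_ENTRY_LOAD_IA32_PKRS));
}
```

With a pending VM-Enter and the load control set, the helper returns false,
which is the case where "!from_vmentry" would have done a needless VMREAD
when restoring L2.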

>> +	     !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS)))
>> +		vmx->nested.vmcs01_guest_pkrs = vmcs_read64(GUEST_IA32_PKRS);
>>   
>>   	/*
>>   	 * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and*
> ...
>
>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>> index 91723a226bf3..82f79ac46d7b 100644
>> --- a/arch/x86/kvm/vmx/vmx.h
>> +++ b/arch/x86/kvm/vmx/vmx.h
>> @@ -222,6 +222,8 @@ struct nested_vmx {
>>   	u64 vmcs01_debugctl;
>>   	u64 vmcs01_guest_bndcfgs;
>>   
> Please pack these together, i.e. don't have a blank line between the various
> vmcs01_* fields.

OK, will check them and remove the blank lines.
