linux-kernel - Re: [PATCH] KVM: nVMX: nested VPID emulation

Message-ID: <BLU436-SMTP5E49A9FB8B5A20AC13EF6805C0@phx.gbl>
Date:	Tue, 15 Sep 2015 18:18:32 +0800
From:	Wanpeng Li <wanpeng.li@...mail.com>
To:	Bandan Das <bsd@...hat.com>
CC:	Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM: nVMX: nested VPID emulation

On 9/15/15 12:08 AM, Bandan Das wrote:
> Wanpeng Li <wanpeng.li@...mail.com> writes:
>
>> VPID is used to tag address spaces and avoid TLB flushes. Currently L0 uses
>> the same VPID to run L1 and all its guests. KVM flushes the VPID when switching
>> between L1 and L2.
>>
>> This patch advertises VPID to the L1 hypervisor, so the address spaces of L1 and
>> L2 can be tagged separately, avoiding a TLB flush when switching between L1 and
>> L2. This patch gives a ~3x performance improvement for lmbench 8p/64k ctxsw.
> TLB flush does context invalidation and while that should result in
> some improvement, I never expected a 3x improvement for any workload!
> Interesting :)

The result still looks good when testing v2.

>
>> Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
>> ---
>>   arch/x86/kvm/vmx.c | 39 ++++++++++++++++++++++++++++++++-------
>>   1 file changed, 32 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index da1590e..06bc31e 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -1157,6 +1157,11 @@ static inline bool nested_cpu_has_virt_x2apic_mode(struct vmcs12 *vmcs12)
>>   	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
>>   }
>>   
>> +static inline bool nested_cpu_has_vpid(struct vmcs12 *vmcs12)
>> +{
>> +	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VPID);
>> +}
>> +
>>   static inline bool nested_cpu_has_apic_reg_virt(struct vmcs12 *vmcs12)
>>   {
>>   	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_APIC_REGISTER_VIRT);
>> @@ -2471,6 +2476,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>>   		SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
>>   		SECONDARY_EXEC_RDTSCP |
>>   		SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
>> +		SECONDARY_EXEC_ENABLE_VPID |
>>   		SECONDARY_EXEC_APIC_REGISTER_VIRT |
>>   		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>>   		SECONDARY_EXEC_WBINVD_EXITING |
>> @@ -4160,7 +4166,7 @@ static void allocate_vpid(struct vcpu_vmx *vmx)
>>   	int vpid;
>>   
>>   	vmx->vpid = 0;
>> -	if (!enable_vpid)
>> +	if (!enable_vpid || is_guest_mode(&vmx->vcpu))
>>   		return;
>>   	spin_lock(&vmx_vpid_lock);
>>   	vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
>> @@ -6738,6 +6744,14 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
>>   	}
>>   	vmcs12 = kmap(page);
>>   	vmcs12->launch_state = 0;
>> +	if (enable_vpid) {
>> +		if (nested_cpu_has_vpid(vmcs12)) {
>> +			spin_lock(&vmx_vpid_lock);
>> +			if (vmcs12->virtual_processor_id != 0)
>> +				__clear_bit(vmcs12->virtual_processor_id, vmx_vpid_bitmap);
>> +			spin_unlock(&vmx_vpid_lock);
>> +		}
>> +	}
>>   	kunmap(page);
>>   	nested_release_page(page);
> I don't think this is enough, we should also check for set "nested" bits
> in free_vpid() and clear them. There should be some sort of a mapping between the
> nested guest vpid and the actual vpid so that we can just clear those bits.

Agreed.
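
To make the mapping idea concrete, here is a minimal userspace sketch, not the v2 code. Each vCPU remembers both the VPID that L0 uses to run L1 and the one it uses to run L2, so the free path can clear both bits in the bitmap. The names toy_vcpu, vpid01, toy_alloc_vpid, toy_free_vpid and the 64-entry bitmap are assumptions made for illustration, and the locking done with vmx_vpid_lock in the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VPIDS 64 /* toy size, not the real VMX_NR_VPIDS */

/* One flag per VPID; VPID 0 is reserved and never handed out. */
static bool toy_vpid_bitmap[TOY_NR_VPIDS] = { [0] = true };

/* Hypothetical per-vCPU state: the "mapping" is simply both IDs kept together. */
struct toy_vcpu {
	uint16_t vpid01; /* VPID L0 uses while running L1 */
	uint16_t vpid02; /* VPID L0 uses while running L2, 0 if none */
};

/* Return the first free VPID, or 0 if the bitmap is exhausted. */
static uint16_t toy_alloc_vpid(void)
{
	for (uint16_t i = 1; i < TOY_NR_VPIDS; i++) {
		if (!toy_vpid_bitmap[i]) {
			toy_vpid_bitmap[i] = true;
			return i;
		}
	}
	return 0;
}

/* Release a VPID; 0 means "nothing was allocated" and is ignored. */
static void toy_free_vpid(uint16_t vpid)
{
	if (vpid == 0)
		return;
	assert(vpid < TOY_NR_VPIDS);
	toy_vpid_bitmap[vpid] = false;
}

/* Allocate both IDs up front, at vCPU initialization. */
static void toy_vcpu_init(struct toy_vcpu *v)
{
	v->vpid01 = toy_alloc_vpid();
	v->vpid02 = toy_alloc_vpid();
}

/* Free both IDs, so the nested bit cannot leak when the vCPU goes away. */
static void toy_vcpu_free(struct toy_vcpu *v)
{
	toy_free_vpid(v->vpid01);
	toy_free_vpid(v->vpid02);
	v->vpid01 = 0;
	v->vpid02 = 0;
}
```

With this layout nothing has to be released in the VMCLEAR path, because the nested VPID has the same lifetime as the vCPU.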

>
>> @@ -9189,6 +9203,7 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>   {
>>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>   	u32 exec_control;
>> +	int vpid;
>>   
>>   	vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector);
>>   	vmcs_write16(GUEST_CS_SELECTOR, vmcs12->guest_cs_selector);
>> @@ -9438,13 +9453,21 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>   	else
>>   		vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
>>   
>> +
> Empty space here.
>
>>   	if (enable_vpid) {
>> -		/*
>> -		 * Trivially support vpid by letting L2s share their parent
>> -		 * L1's vpid. TODO: move to a more elaborate solution, giving
>> -		 * each L2 its own vpid and exposing the vpid feature to L1.
>> -		 */
>> -		vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
>> +		if (nested_cpu_has_vpid(vmcs12)) {
>> +			if (vmcs12->virtual_processor_id == 0) {
> Ok, so if we advertise vpid to the nested hypervisor, isn't it going to
> attempt writing this field when setting up? At least
> that's what Linux does, no?

Agreed, I do the allocation of vpid02 during initialization in v2.

>
>> +				spin_lock(&vmx_vpid_lock);
>> +				vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
>> +				if (vpid < VMX_NR_VPIDS)
>> +					__set_bit(vpid, vmx_vpid_bitmap);
>> +				spin_unlock(&vmx_vpid_lock);
>> +				vmcs_write16(VIRTUAL_PROCESSOR_ID, vpid);
>> +			} else
>> +				vmcs_write16(VIRTUAL_PROCESSOR_ID, vmcs12->virtual_processor_id);
>> +		} else
>> +			vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
>> +
> I guess L1 shouldn't know what vpid L0 chose to run L2. If L1 vmreads,
> it should get what it expects for the value of vpid, not the one L0 chose.

Agreed.
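
A rough userspace sketch of that separation, under stated assumptions: the value L1 wrote stays untouched in vmcs12 and is what a vmread returns, while the VPID L0 actually runs L2 with lives only in L0's own state. The toy_* names, the vpid_enabled flag (standing in for SECONDARY_EXEC_ENABLE_VPID in vmcs12) and the hw_vpid field (standing in for VIRTUAL_PROCESSOR_ID in vmcs02) are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What L1 sees and controls. */
struct toy_vmcs12 {
	bool vpid_enabled;             /* L1 turned on VPID for L2 */
	uint16_t virtual_processor_id; /* value L1 wrote, never rewritten by L0 */
};

/* What only L0 sees. */
struct toy_nested {
	uint16_t vpid02;  /* VPID L0 reserved for running L2 */
	uint16_t hw_vpid; /* stand-in for VIRTUAL_PROCESSOR_ID in vmcs02 */
};

/*
 * Pick the VPID used on hardware for L2. If L1 enabled VPID and L0 has a
 * private vpid02, use it; otherwise fall back to sharing L1's vpid01.
 */
static void toy_prepare_vmcs02(struct toy_nested *n,
			       const struct toy_vmcs12 *v12,
			       uint16_t vpid01)
{
	if (v12->vpid_enabled && n->vpid02 != 0)
		n->hw_vpid = n->vpid02;
	else
		n->hw_vpid = vpid01;
}

/* An emulated vmread by L1 only ever returns L1's own value. */
static uint16_t toy_vmread_vpid(const struct toy_vmcs12 *v12)
{
	return v12->virtual_processor_id;
}
```

Because toy_prepare_vmcs02() takes vmcs12 as const, the L0-chosen ID cannot leak back into what L1 reads.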

>
>>   		vmx_flush_tlb(vcpu);
>>   	}
> So, this isn't removed ? I thought it's not needed anymore ?

Please review v2. :-)

Regards,
Wanpeng Li
