linux-kernel - Re: [PATCH v2 0/2] KVM: SVM: Create separate vmcbs for L1 and L2

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1a450000-dd0c-6c38-191b-8ad869c21807@redhat.com>
Date:   Tue, 17 Nov 2020 12:03:12 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Cathy Avery <cavery@...hat.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     vkuznets@...hat.com, wei.huang2@....com, mlevitsk@...hat.com,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [PATCH v2 0/2] KVM: SVM: Create separate vmcbs for L1 and L2

Hi Cathy,

I found an issue with the second patch: the svm->asid_generation and 
svm->vcpu.cpu fields must become per-VMCB.  Once again we are 
rediscovering what VMX already does (for different reasons) with its 
struct loaded_vmcs.

The good thing is that it can be worked around easily: for the ASID 
generation, it simply be cleared after changing svm->vmcb.  For the CPU 
field, it's not an issue yet because the VMCB is marked all-dirty after 
each switch.  With these workarounds, everything works nicely.

However, removing the calls to vmcb_mark_all_dirty is the main 
optimization that is enabled by the VMCB01/VMCB02 split, so it should be 
fixed too.  And also, the code duplication every time svm->vmcb is 
assigned starts to be ugly, proving Sean to be right. :)

My suggestion is that you do something like this:

1) create a struct for all per-VMCB data:

	struct kvm_vmcb_info {
		void *ptr;
		u64 pa;
	}

and use it in structs vcpu_svm and svm_nested_state:

	struct vcpu_svm {
		...
		struct kvm_vmcb_info vmcb01;
		struct kvm_vmcb_info *current_vmcb;
		void *vmcb;
		u64 vmcb_pa;
		...
	}

	struct svm_nested_state {
		struct kvm_vmcb_info vmcb02;
		...
	}

The struct can be passed to a vmcb_switch function like this one:

	void vmcb_switch(struct vcpu_svm *svm,
			 struct kvm_vmcb_info *target_vmcb)
	{
		svm->current_vmcb = target_vmcb;
		svm->vmcb = target_vmcb->ptr;
		svm->vmcb_pa = target_vmcb->pa;

		/*
		 * Workaround: we don't yet track the ASID generation
		 * that was active the last time target_vmcb was run.
		 */
		svm->asid_generation = 0;

		/*
		 * Workaround: we don't yet track the physical CPU that
		 * target_vmcb has run on.
		 */
		vmcb_mark_all_dirty(svm->vmcb);
	}

You can use this function every time the current code is assigning to 
svm->vmcb.  Once the struct and function is in place, we can proceed to 
removing the last two (inefficient) lines of vmcb_switch by augmenting 
struct kvm_vmcb_info.

2) First, add an "int cpu" field.  Move the vcpu->cpu check from 
svm_vcpu_load to pre_svm_run, using svm->current_vmcb->cpu instead of 
vcpu->cpu, and you will be able to remove the vmcb_mark_all_dirty call 
from vmcb_switch.

3) Then do the same with asid_generation.  All uses of 
svm->asid_generation become uses of svm->current_vmcb->asid_generation, 
and you can remove the clearing of svm->asid_generation.

These can be three separate patches on top of the changes you have sent 
(or rather the rebased version, see below).  Writing good commit 
messages for them will be a good exercise too. :)

I have pushed the current nested SVM queue to kvm.git on a "nested-svm" 
branch.  You can discard the top commits and work on top of commit 
a33b86f151a0 from that branch ("KVM: SVM: Use a separate vmcb for the 
nested L2 guest", 2020-11-17).

Once this is done, we can start reaping the advantages of the 
VMCB01/VMCB02 split.  Some patches for that are already in the 
nested-svm branch, but there's more fun to be had.  First of all, 
Maxim's ill-fated attempt at using VMCB12 clean bits will now work. 
Second, we can try doing VMLOAD/VMSAVE always from VMCB01 (while VMRUN 
switches between VMCB01 and VMCB02) and therefore remove the 
nested_svm_vmloadsave calls from nested_vmrun and nested_vmexit.  But, 
one thing at a time.

Thanks,

Paolo