[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e453f35-cd26-1df8-5f8e-68fa09c6a1a3@amd.com>
Date: Tue, 25 Apr 2017 17:02:58 -0500
From: Brijesh Singh <brijesh.singh@....com>
To: Radim Krčmář <rkrcmar@...hat.com>
CC: <brijesh.singh@....com>, <pbonzini@...hat.com>, <joro@...tes.org>,
<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<tglx@...utronix.de>, <mingo@...hat.com>, <hpa@...or.com>,
<x86@...nel.org>, <Thomas.Lendacky@....com>
Subject: Re: [PATCH] x86: kvm: Avoid guest page table walk when gpa_available
is set
>>
>> I also wanted to avoid adding yet another variable but we can't depend on
>> cr2 parameters passed into x86_emulate_instruction().
>>
>> The x86_emulate_instruction() function is called from two places:
>>
>> 1) handling the page-fault.
>> pf_interception [svm.c]
>> kvm_mmu_page_fault [mmu.c]
>> x86_emulate_instruction [x86.c]
>>
>> 2) completing the IO/MMIO's from previous instruction decode
>> kvm_arch_vcpu_ioctl_run
>> complete_emulated_io
>> emulate_instruction
>> x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)
>>
>> In #1, we are guaranteed that cr2 variable will contain a valid GPA but
>> in #2, CR2 is set to zero.
>
> We are setting up the completion in #1 x86_emulate_instruction(), where
> the gpa (cr2) is available, so we could store the value while arming
> vcpu->arch.complete_userspace_io.
>
> emulator_read_write_onepage() already saves gpa in frag->gpa, which is
> then passed into complete_emulated_mmio -- isn't that mechanism
> sufficient?
>
I see that complete_emulated_mmio() saves the frag>gpa into run->mmio.phys_addr,
so based on the exit_reason we should be able to get the saved gpa.
In my debug patch below, I tried doing something similar to verify that frag->gpa
contains the valid CR2 value but I saw a bunch of mismatch. So it seems like we
may not able to use frag->gpa mechanism. Additionally we also need to handle the
PIO cases (e.g what if we are called from complete_emulated_pio), which also takes
similar code path
complete_emulated_pio
completed_emulated_io
emulate_instruction
x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)
@@ -5682,13 +5686,20 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
restart:
/*
- * Save the faulting GPA (cr2) in the address field
- * NOTE: If gpa_available is set then gpa_val will contain a valid GPA
+ * if previous exit was due to userspace mmio completion then actual
+ * cr2 is stored in mmio.phys_addr.
*/
- if (vcpu->arch.gpa_available)
- ctxt->exception.address = vcpu->arch.gpa_val;
- else
- ctxt->exception.address = cr2;
+ if (vcpu->run->exit_reason == KVM_EXIT_MMIO) {
+ cr2 = vcpu->run->mmio.phys_addr;
+ if (cr2 != vcpu->arch.gpa_val)
+ pr_err("** mismatch %llx %llx\n",
+ vcpu->run->mmio.phys_addr, vcpu->arch.gpa_val);
+ }
+
+ /* Save the faulting GPA (cr2) in the address field */
+ ctxt->exception.address = cr2;
>>
>> handle_exit [svm.c]
>> pf_interception [svm.c]
>> /* it invokes the fault handler with CR2 = svm->vmcb->control.exit_info_2 */
>> kvm_mmu_page_fault [mmu.c]
>> x86_emulate_instruction [x86.c]
>> emulator_read_write_onepage [x86.c]
>> /*
>> *this is where we walk the guest page table to translate
>> * a GVA to GPA. If gpa_available is set then we use the
>> * gpa_val instead of walking the pgtable.
>> */
>
> pf_interception is the NPF exit handler -- please move the setting
> there, at least. handle_exit() is a hot path that shouldn't contain
> code that isn't applicable to all exits.
>
Sure, Will do.
> Btw. we need some other guarantees to declare it as GPA (cr2 is GPA in
> NPT exits, but might not be in other) ... isn't arch.mmu.direct_map a
> condition we are interested in?
>
> The other code uses it to interpret cr2 directly as gpa, so we might be
> able to avoid setting the arch.gpa_available in a hot path too.
>
Hmm looking at the call trace I am not sure how arch.mmu_direct_map will help
but I will investigate a bit more.
>>
>> See my previous comment. In some cases CR2 may be set to zero
>> (e.g when completing the instruction from previous io/mmio page-fault).
>>
>> If we are decide to add the gpa_val then we can remove above if
>> statement from x86_emulate_instruction() and update emulator_read_write_onepage
>> to use the vcpu->arch.gpa_val instead of exception->address.
>
> Yeah, that would be nicer than setting exception->address at a random
> place.
>
> We could also pass the value as cr2 in all cases if it made something
> better.
>
>> if (vcpu->arch.gpa_available &&
>> emulator_can_use_gpa(ctxt) &&
>> (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) {
>> gpa = vcpu=>arch.gpa_val;
>> ...
>> ...
>> }
>
> If at all possible, I'd like to have the gpa passed with other relevant
> data, instead of having it isolated like this ... and we can't manage
> that, then at least good benchmark results to excuse the bad code.
>
I ran two tests to see how many times we walk guest page table.
Test1: run kvm-unit-test
Test2: launch Ubuntu 16.06 guest, run a stressapptest for 20 seconds, shutdown the VM
Before patch
* Test1: 10419
* Test2: 243365
After patch:
* Test1: 1259
* Test2: 1221
Please let me know if you want me to run other other benchmark and capture the data.
-Brijesh
Powered by blists - more mailing lists