lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 25 Apr 2017 17:02:58 -0500
From:   Brijesh Singh <brijesh.singh@....com>
To:     Radim Krčmář <rkrcmar@...hat.com>
CC:     <brijesh.singh@....com>, <pbonzini@...hat.com>, <joro@...tes.org>,
        <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <tglx@...utronix.de>, <mingo@...hat.com>, <hpa@...or.com>,
        <x86@...nel.org>, <Thomas.Lendacky@....com>
Subject: Re: [PATCH] x86: kvm: Avoid guest page table walk when gpa_available
 is set


>>
>> I also wanted to avoid adding yet another variable but we can't depend on
>> cr2 parameters passed into x86_emulate_instruction().
>>
>> The x86_emulate_instruction() function is called from two places:
>>
>> 1) handling the page-fault.
>> pf_interception [svm.c]
>>  kvm_mmu_page_fault [mmu.c]
>>   x86_emulate_instruction [x86.c]
>>
>> 2) completing the IO/MMIO's from previous instruction decode
>> kvm_arch_vcpu_ioctl_run
>>  complete_emulated_io
>>    emulate_instruction
>>     x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)
>>
>> In #1, we are guaranteed that cr2 variable will contain a valid GPA but
>> in #2, CR2 is set to zero.
>
> We are setting up the completion in #1 x86_emulate_instruction(), where
> the gpa (cr2) is available, so we could store the value while arming
> vcpu->arch.complete_userspace_io.
>
> emulator_read_write_onepage() already saves gpa in frag->gpa, which is
> then passed into complete_emulated_mmio -- isn't that mechanism
> sufficient?
>

I see that complete_emulated_mmio() saves the frag>gpa into run->mmio.phys_addr,
so based on the exit_reason we should be able to get the saved gpa.
In my debug patch below, I tried doing something similar to verify that frag->gpa
contains the valid CR2 value but I saw a bunch of mismatch. So it seems like we
may not able to use frag->gpa mechanism. Additionally we also need to handle the
PIO cases (e.g what if we are called from complete_emulated_pio), which also takes
similar code path

complete_emulated_pio
  completed_emulated_io
    emulate_instruction
     x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)

@@ -5682,13 +5686,20 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
  
  restart:
         /*
-        * Save the faulting GPA (cr2) in the address field
-        * NOTE: If gpa_available is set then gpa_val will contain a valid GPA
+        * if previous exit was due to userspace mmio completion then actual
+        * cr2 is stored in mmio.phys_addr.
          */
-       if (vcpu->arch.gpa_available)
-               ctxt->exception.address = vcpu->arch.gpa_val;
-       else
-               ctxt->exception.address = cr2;
+       if (vcpu->run->exit_reason == KVM_EXIT_MMIO) {
+               cr2 = vcpu->run->mmio.phys_addr;
+               if (cr2 != vcpu->arch.gpa_val)
+                       pr_err("** mismatch %llx %llx\n",
+                               vcpu->run->mmio.phys_addr, vcpu->arch.gpa_val);
+       }
+
+        /* Save the faulting GPA (cr2) in the address field */
+       ctxt->exception.address = cr2;


>>
>> handle_exit [svm.c]
>>  pf_interception  [svm.c]
>> /* it invokes the fault handler with CR2 = svm->vmcb->control.exit_info_2 */
>>   kvm_mmu_page_fault [mmu.c]
>>    x86_emulate_instruction  [x86.c]
>>     emulator_read_write_onepage  [x86.c]
>>      /*
>>       *this is where we walk the guest page table to translate
>>       * a GVA to GPA. If gpa_available is set then we use the
>>       * gpa_val instead of walking the pgtable.
>>       */
>
> pf_interception is the NPF exit handler -- please move the setting
> there, at least.  handle_exit() is a hot path that shouldn't contain
> code that isn't applicable to all exits.
>

Sure, Will do.

> Btw. we need some other guarantees to declare it as GPA (cr2 is GPA in
> NPT exits, but might not be in other) ... isn't arch.mmu.direct_map a
> condition we are interested in?
>
> The other code uses it to interpret cr2 directly as gpa, so we might be
> able to avoid setting the arch.gpa_available in a hot path too.
>

Hmm looking at the call trace I am not sure how arch.mmu_direct_map will help
but I will investigate a bit more.

>>
>> See my previous comment. In some cases CR2 may be set to zero
>> (e.g when completing the instruction from previous io/mmio page-fault).
>>
>> If we are decide to add the gpa_val then we can remove above if
>> statement from x86_emulate_instruction() and update emulator_read_write_onepage
>> to use the vcpu->arch.gpa_val instead of exception->address.
>
> Yeah, that would be nicer than setting exception->address at a random
> place.
>
> We could also pass the value as cr2 in all cases if it made something
> better.
>
>> if (vcpu->arch.gpa_available &&
>>      emulator_can_use_gpa(ctxt) &&
>>      (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) {
>>       gpa = vcpu=>arch.gpa_val;
>>       ...
>>       ...
>>    }
>
> If at all possible, I'd like to have the gpa passed with other relevant
> data, instead of having it isolated like this ... and we can't manage
> that, then at least good benchmark results to excuse the bad code.
>

I ran two tests to see how many times we walk  guest page table.

Test1: run kvm-unit-test
Test2: launch Ubuntu 16.06 guest, run a stressapptest for 20 seconds, shutdown the VM

Before patch
   * Test1: 10419
   * Test2: 243365

After patch:
   * Test1:  1259
   * Test2:  1221

Please let me know if you want me to run other other benchmark and capture the data.

-Brijesh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ