Date:   Wed, 22 Mar 2017 21:44:57 +0800
From:   Wanpeng Li <kernellwp@...il.com>
To:     Ladi Prosek <lprosek@...hat.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        KVM list <kvm@...r.kernel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Wanpeng Li <wanpeng.li@...mail.com>
Subject: Re: [PATCH] KVM: nVMX: Fix L2 guest hang if shadow page tables on EPT

2017-03-22 20:00 GMT+08:00 Ladi Prosek <lprosek@...hat.com>:
> On Sat, Mar 18, 2017 at 7:37 AM, Wanpeng Li <kernellwp@...il.com> wrote:
>> 2017-03-18 1:28 GMT+08:00 Ladi Prosek <lprosek@...hat.com>:
>>> On Fri, Mar 17, 2017 at 3:41 PM, Wanpeng Li <kernellwp@...il.com> wrote:
>>>> From: Wanpeng Li <wanpeng.li@...mail.com>
>>>>
>>>> The L2 guest hangs if shadow page tables on EPT are used; the trace on L1
>>>> shows L2 repeatedly exiting with reason EXCEPTION_NMI and a page fault:
>>>>
>>>> qemu-system-x86-2821  [003] d..2    45.848814: kvm_entry: vcpu 0
>>>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
>>>> qemu-system-x86-2821  [003] ...1    45.848827: kvm_page_fault: address fe05b error_code 14
>>>>
>>>> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
>>>> avoids loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
>>>> uninitialized in real mode. A Hyper-V L1 emulates L2 real mode with PAE
>>>> paging and EPT enabled. However, during system boot the guest passes through
>>>> legacy mode's protected sub-mode on its way to long mode, and the check
>>>> in nested_vmx_load_cr3() prevents loading the PDPTRs while L2 is still in
>>>> protected mode with PAE paging and nested EPT/shadow page tables on EPT. In
>>>> fact, the original commit was only intended to avoid dereferencing L2's CR3
>>>> when the L1 hypervisor emulates L2's real mode through virtual-8086 mode.
>>>>
>>>> This patch fixes it by allowing the PDPTRs to be loaded if PAE paging and EPT
>>>> are enabled and !vm86_active.
>>>>
>>>> Cc: Paolo Bonzini <pbonzini@...hat.com>
>>>> Cc: Radim Krčmář <rkrcmar@...hat.com>
>>>> Cc: Ladi Prosek <lprosek@...hat.com>
>>>> Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
>>>> ---
>>>>  arch/x86/kvm/vmx.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index c664365..2b2a05f 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>>>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>>>>                                u32 *entry_failure_code)
>>>>  {
>>>> -       if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>>>> +       if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>>>>                 if (!nested_cr3_valid(vcpu, cr3)) {
>>>>                         *entry_failure_code = ENTRY_FAIL_DEFAULT;
>>>>                         return 1;
>>>> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>>>>                  * must not be dereferenced.
>>>>                  */
>>>>                 if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>>>> -                   !nested_ept) {
>>>> +                   !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>>>
>>> This change breaks Hyper-V on KVM. L2 hangs on start-up, same symptoms
>>> as before 7ca29de21362.
>>
>> Hmm, I missed that pdptrs_changed() also dereferences CR3.
>> How about something like this:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c664365..d7ebf03 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9933,7 +9933,9 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>>                     u32 *entry_failure_code)
>>  {
>> -    if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>> +    if (cr3 != kvm_read_cr3(vcpu) ||
>> +        (!(nested_ept && to_vmx(vcpu)->rmode.vm86_active) &&
>> +        pdptrs_changed(vcpu))) {
>>          if (!nested_cr3_valid(vcpu, cr3)) {
>>              *entry_failure_code = ENTRY_FAIL_DEFAULT;
>>              return 1;
>> @@ -9944,7 +9946,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>>           * must not be dereferenced.
>>           */
>>          if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>> -            !nested_ept) {
>> +            !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>>              if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
>>                  *entry_failure_code = ENTRY_FAIL_PDPTE;
>>                  return 1;
>
> Still the same, Hyper-V is broken. The problem is not in real vs.
> protected mode. The way nested_ept_enabled is computed is incorrect.
>
> I can run both Hyper-V and KVM with EPT = 0 in L1 with this patch. Can
> you please give it a try?
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 98e82ee..9145c94 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10121,7 +10121,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>                                 vmcs12->guest_intr_status);
>                 }
>
> -               nested_ept_enabled = (exec_control & SECONDARY_EXEC_ENABLE_EPT) != 0;
> +               nested_ept_enabled = (vmcs12->secondary_vm_exec_control & SECONDARY_EXEC_ENABLE_EPT) != 0;
>
>                 /*
>                  * Write an illegal value to APIC_ACCESS_ADDR. Later,

You are right, it works. Please send out a formal patch and add the
kvm-unit-tests as Paolo mentioned.

Regards,
Wanpeng Li
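
For readers following the thread, here is a minimal standalone sketch of why the one-line change in prepare_vmcs02() matters. It is not kernel code: struct vmcs12_sketch, merged_exec_control(), nested_ept_old() and nested_ept_new() are hypothetical names, and the merge is reduced to a plain OR under the assumption that exec_control at that point already includes the EPT bit L0 enabled for its own use. Only SECONDARY_EXEC_ENABLE_EPT and the field name secondary_vm_exec_control come from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "Enable EPT" bit of the secondary processor-based VM-execution controls. */
#define SECONDARY_EXEC_ENABLE_EPT 0x00000002u

/* Hypothetical, heavily reduced stand-in for the kernel's struct vmcs12. */
struct vmcs12_sketch {
    uint32_t secondary_vm_exec_control; /* controls L1 requested for L2 */
};

/*
 * Assumed simplification of the merge done for vmcs02: the controls L0
 * wants for itself are combined with the controls L1 asked for.
 */
static uint32_t merged_exec_control(uint32_t l0_controls,
                                    const struct vmcs12_sketch *vmcs12)
{
    return l0_controls | vmcs12->secondary_vm_exec_control;
}

/* Old computation: looks at the merged value, so L0's own EPT bit leaks in. */
static bool nested_ept_old(uint32_t exec_control)
{
    return (exec_control & SECONDARY_EXEC_ENABLE_EPT) != 0;
}

/* New computation: looks only at what L1 requested in vmcs12. */
static bool nested_ept_new(const struct vmcs12_sketch *vmcs12)
{
    return (vmcs12->secondary_vm_exec_control & SECONDARY_EXEC_ENABLE_EPT) != 0;
}
```

With L0 using EPT and L1 running with EPT = 0 (shadow paging for L2), nested_ept_old() on the merged value reports true while nested_ept_new() reports false. Under the assumptions above, that mismatch would make nested_vmx_load_cr3() skip the PDPTR load for a PAE guest that is not actually running on nested EPT.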
