Message-ID: <67f40465-95de-3523-f6f2-09e980dd40d7@redhat.com>
Date:   Tue, 25 Jul 2017 12:55:16 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Wanpeng Li <kernellwp@...il.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     Radim Krčmář <rkrcmar@...hat.com>,
        Wanpeng Li <wanpeng.li@...mail.com>
Subject: Re: [PATCH v2] KVM: nVMX: Fix losing NMI blocking state

On 25/07/2017 12:40, Wanpeng Li wrote:
> Commit 4c4a6f790ee862 (KVM: nVMX: track NMI blocking state separately for each VMCS)
> tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough:
> 
>  - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the 
>    L2 can generate #PF which can be intercepted by L0. 
>  - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 
>    guest and injects the #PF into L1.
>  - L1 awares it should set bit 3 (blocking by NMI) in the interruptibility-state field 
>    and fix the shadow page table before resuming L2 guest.
>  - L1 executes VMRESUME to resume L2 which generates vmexit and causes L1 exit to L0 
>  - L0 emulates VMRESUME which is called from L1, however, it lost the interruptibility 
>    state field which is updated in vmcs12 when prepare vmcs02
>  - .........

The "..." part is not very enlightening.  My understanding is:

 - The L2 (kvm-unit-tests/eventinj.flat) generates an NMI that will fault
   on IRET, so the L2 can generate a #PF which can be intercepted by L0.
 - L0 walks L1's guest page table and sees that the mapping is invalid, so
   it resumes the L1 guest and injects the #PF into L1.  At this point the
   vmcs02 has nmi_known_unmasked=true.
 - L1 sets bit 3 (blocking by NMI) in the interruptibility-state field
   of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
 - L1 executes VMRESUME to resume L2, causing a vmexit to L0
 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
   interruptibility-state field of vmcs02, but nmi_known_unmasked is
   still true.
 - on the next L2 exit to L0, nmi_known_unmasked is true so
   vmx_recover_nmi_blocking does not do anything.
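
To make the sequence above concrete, here is a toy model in C.  It is only
a sketch under stated assumptions, not the KVM code: struct toy_vmcs and
every toy_* function are hypothetical stand-ins, and only the bit position
(bit 3, blocking by NMI) and the nmi_known_unmasked flag come from the
description above.  It shows the cached flag disagreeing with the
interruptibility-state field after the field is copied from vmcs12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of the interruptibility-state field: blocking by NMI. */
#define TOY_INTR_STATE_NMI (1u << 3)

/* Hypothetical stand-in for a VMCS plus its software-side cache flag. */
struct toy_vmcs {
    uint32_t interruptibility;  /* models the interruptibility-state field */
    bool nmi_known_unmasked;    /* cached belief that NMIs are not blocked */
};

/*
 * Models the VMRESUME emulation step: the field is copied from vmcs12
 * into vmcs02, and the cache flag is left untouched (assumption taken
 * from the sequence described above).
 */
void toy_prepare_vmcs02(struct toy_vmcs *vmcs02, const struct toy_vmcs *vmcs12)
{
    vmcs02->interruptibility = vmcs12->interruptibility;
}

/*
 * Models the early return on the next L2 exit: when the cache says
 * "unmasked", the field is never looked at.  Returns true only if the
 * field was actually examined.
 */
bool toy_recover_nmi_blocking(struct toy_vmcs *vmcs02)
{
    if (vmcs02->nmi_known_unmasked)
        return false;
    vmcs02->nmi_known_unmasked =
        (vmcs02->interruptibility & TOY_INTR_STATE_NMI) == 0;
    return true;
}

/* Replays the steps; returns true if the cache contradicts the field. */
bool toy_cache_is_stale_after_replay(void)
{
    struct toy_vmcs vmcs02 = { .interruptibility = 0, .nmi_known_unmasked = true };
    struct toy_vmcs vmcs12 = { .interruptibility = 0, .nmi_known_unmasked = false };

    vmcs12.interruptibility |= TOY_INTR_STATE_NMI;  /* L1 sets bit 3 in vmcs12 */
    toy_prepare_vmcs02(&vmcs02, &vmcs12);           /* L0 emulates VMRESUME */

    bool examined = toy_recover_nmi_blocking(&vmcs02);  /* next L2 exit to L0 */
    assert(!examined);  /* the early return was taken */

    return vmcs02.nmi_known_unmasked &&
           (vmcs02.interruptibility & TOY_INTR_STATE_NMI) != 0;
}
```

Calling toy_cache_is_stale_after_replay() from a small main() returns true
in this model: bit 3 is set in the toy vmcs02 while nmi_known_unmasked
still claims NMIs are unmasked.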

Can you instead explain what happens if your v1 patch is applied (on top of mine),
and why it fixes the bug?

The patch is correct and almost obvious, but I'd like the commit message to be precise.

(Also, does your machine have shadow VMCS support?)

Thanks,

Paolo
