Message-ID: <01564b34-2476-2098-7ec8-47336922afda@redhat.com>
Date: Wed, 23 Jun 2021 15:21:47 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Maxim Levitsky <mlevitsk@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org
Cc: Sean Christopherson <seanjc@...gle.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Cathy Avery <cavery@...hat.com>,
Emanuele Giuseppe Esposito <eesposit@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: nSVM: Fix L1 state corruption upon return from
SMM
On 23/06/21 15:01, Maxim Levitsky wrote:
> I did some homework on this now and I would like to share a few of my thoughts on it:
>
> First of all, my attention was caught by the way we intercept the #SMI
> (this isn't 100% related to the bug but still worth talking about, IMHO).
>
> A. Bare metal: It looks like SVM allows intercepting SMI, with SVM_EXIT_SMI,
> with the intention of then entering the BIOS SMM handler manually using the SMM_CTL MSR.
... or just using STGI, which is what happens for KVM. This is in the
manual: "The hypervisor may respond to the #VMEXIT(SMI) by executing the
STGI instruction, which causes the pending SMI to be taken immediately".
It *should* work for KVM to just not intercept SMI, but it adds more
complexity for no particular gain.
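For reference, on the KVM side the intercept really is a nop; the handler
boils down to something like this (a simplified sketch with an illustrative
name, not the exact svm.c code):

    /*
     * Sketch: KVM handles SVM_EXIT_SMI by doing nothing.  GIF is clear
     * after the #VMEXIT, so the SMI stays latched and is taken as soon
     * as STGI sets GIF again on the way back into the guest.
     * (In svm.c this is just the generic nop handler.)
     */
    static int smi_interception(struct kvm_vcpu *vcpu)
    {
            return 1;       /* nothing to emulate, just resume */
    }

    ...
    [SVM_EXIT_SMI] = smi_interception,
    ...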
> On bare metal we do set INTERCEPT_SMI, but we emulate the exit as a nop.
> I guess on bare metal there are some undocumented bits that the BIOS sets
> which make the CPU ignore that SMI intercept and still take the #SMI handler
> normally, but I wonder if we could still break some motherboard code due to
> that.
>
> B. Nested: If #SMI is intercepted, then it causes a nested VMEXIT.
> Since KVM does enable the SMI intercept, when it runs nested this means that
> all SMIs the nested KVM gets are emulated as a NOP, and L1's SMI handler is not run.
No, this is incorrect. Note that svm_check_nested_events does not clear
smi_pending the way vmx_check_nested_events does it for nmi_pending. So
the interrupt is still there and will be injected on the next STGI.
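For reference, the SMI path in svm_check_nested_events() does roughly this
(paraphrased, not verbatim kernel code):

    /*
     * Paraphrase: a nested #VMEXIT(SMI) is synthesized for L1, but
     * vcpu->arch.smi_pending is left set, so the SMI itself is still
     * delivered (and L1's SMI handler run) once L1 executes STGI.
     */
    if (vcpu->arch.smi_pending && !svm_smi_blocked(vcpu)) {
            if (block_nested_events)
                    return -EBUSY;
            if (!nested_exit_on_smi(svm))
                    return 0;
            svm->vmcb->control.exit_code = SVM_EXIT_SMI;
            nested_svm_vmexit(svm);
            return 0;
    }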
Paolo
>
> Now, about the issue that is fixed by this patch: let me try to understand
> how it would work on bare metal:
>
> 1. A guest is entered. Host state is saved to the VM_HSAVE_PA area (or
> stashed somewhere in the CPU).
>
> 2. #SMI (without intercept) happens.
>
> 3. The CPU has to exit SVM and start running the host SMI handler: it loads
> the SMM state without touching the VM_HSAVE_PA area, runs the SMI handler,
> and then, once it executes RSM, restores the guest state from the SMM save
> area and continues the guest.
>
> 4. Once a normal VMexit happens, the host state is restored from the
> VM_HSAVE_PA area.
>
> So the host state indeed can't be saved to vmcb01.
>
> To be honest, I think I would prefer not to use L1's hsave area, but rather
> to add back our own 'hsave' in KVM and always store the L1 host state there
> on nested entry.
>
> This way we would avoid touching vmcb01 at all, and both solve the issue and
> reduce code complexity.
> (Copying the L1 host state to what is basically the L1 guest state area, and
> back, even has a comment explaining why it was possible to do so, before you
> discovered that this doesn't work with SMM.)
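>
> Roughly what I have in mind (just a sketch of the idea; the field and the
> copy helper are hypothetical and this is untested):
>
>     struct vcpu_svm {
>             ...
>             /*
>              * Hypothetical: KVM-internal stash for the L1 host state,
>              * written on every nested entry instead of reusing vmcb01
>              * or the guest's own hsave area.
>              */
>             struct vmcb_save_area hsave;
>             ...
>     };
>
>     /* On emulated VMRUN (enter_svm_guest_mode): */
>     copy_vmrun_state(&svm->hsave, &svm->vmcb01.ptr->save);
>
>     /* On nested #VMEXIT: */
>     copy_vmrun_state(&svm->vmcb01.ptr->save, &svm->hsave);
>
>     /* copy_vmrun_state() stands in for whatever helper copies the
>      * VMRUN-affected save area fields. */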
>
> Thanks again for fixing this bug!
>
> Best regards,
> Maxim Levitsky
>