linux-kernel - Re: [PATCH v3] KVM: VMX: Enable Notify VM exit

Message-ID: <CALMp9eS6cBDuax8O=woSdkNH2e2Y2EodE-7EfUTFfzBvCWCmcg@mail.gmail.com>
Date:   Fri, 25 Feb 2022 20:53:48 -0800
From:   Jim Mattson <jmattson@...gle.com>
To:     Xiaoyao Li <xiaoyao.li@...el.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Chenyi Qiang <chenyi.qiang@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] KVM: VMX: Enable Notify VM exit

On Fri, Feb 25, 2022 at 8:25 PM Jim Mattson <jmattson@...gle.com> wrote:
>
> On Fri, Feb 25, 2022 at 8:07 PM Xiaoyao Li <xiaoyao.li@...el.com> wrote:
> >
> > On 2/25/2022 11:13 PM, Paolo Bonzini wrote:
> > > On 2/25/22 16:12, Xiaoyao Li wrote:
> > >>>>>
> > >>>>
> > >>>> I don't like the idea of making things up without notifying userspace
> > >>>> that this is fictional. How is my customer running nested VMs supposed
> > >>>> to know that L2 didn't actually shutdown, but L0 killed it because the
> > >>>> notify window was exceeded? If this information isn't reported to
> > >>>> userspace, I have no way of getting the information to the customer.
> > >>>
> > >>> Then, maybe a dedicated software-defined VM exit for it instead of
> > >>> reusing triple fault?
> > >>>
> > >>
> > >> On second thought, we could even just reflect the Notify VM exit to
> > >> L1 to tell it that L2 caused one, even though Notify VM exit is not
> > >> exposed to L1.
> > >
> > > That might cause NULL pointer dereferences or other nasty occurrences.
> >
> > IMO, a well written VMM (in L1) should handle it correctly.
> >
> > L0 KVM reports no Notify VM Exit support to L1, so L1 runs without
> > enabling Notify VM exit. If an L2 causes notify_vm_exit with
> > invalid_vm_context, L0 just reflects it to L1. From L1's point of view,
> > the VMX capability MSRs report no support for Notify VM Exit. The
> > following L1 handlers are possible:
> >
> > a)      if (notify_vm_exit available && notify_vm_exit enabled) {
> >                 handle in b)
> >         } else {
> >                 report unexpected vm exit reason to userspace;
> >         }
> >
> > b)      similar handler like we implement in KVM:
> >         if (!vm_context_invalid)
> >                 re-enter guest;
> >         else
> >                 report to userspace;
> >
> > c)      no Notify VM Exit related code (e.g. old KVM), it's treated as
> > unsupported exit reason
> >
> > As long as it belongs to any case above, I think L1 can handle it
> > correctly. Any nasty occurrence should be caused by incorrect handler in
> > L1 VMM, in my opinion.
>
> Please test some common hypervisors (e.g. ESXi and Hyper-V).

I took a look at KVM in Linux v4.9 (one of our more popular guests),
and it will not handle this case well:

        if (exit_reason < kvm_vmx_max_exit_handlers
            && kvm_vmx_exit_handlers[exit_reason])
                return kvm_vmx_exit_handlers[exit_reason](vcpu);
        else {
                WARN_ONCE(1, "vmx: unexpected exit reason 0x%x\n", exit_reason);
                kvm_queue_exception(vcpu, UD_VECTOR);
                return 1;
        }

At least there's an L1 kernel log message for the first unexpected
NOTIFY VM-exit, but after that, there is silence. Just a completely
inexplicable #UD in L2, assuming that L2 is resumable at this point.