lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN6PR1101MB216162F44664713802201FAFA80C9@BN6PR1101MB2161.namprd11.prod.outlook.com>
Date:   Wed, 23 Nov 2022 08:31:52 +0000
From:   "Li, Xin3" <xin3.li@...el.com>
To:     "Christopherson,, Sean" <seanjc@...gle.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "Tian, Kevin" <kevin.tian@...el.com>
Subject: RE: [RESEND PATCH 5/6] KVM: x86/VMX: add kvm_vmx_reinject_nmi_irq()
 for NMI/IRQ reinjection

> > > > > > > > And yes, the current code appears to suffer the same defect.
> > > >
> > > > That defect isn't going to be fixed simply by changing how KVM
> > > > forwards NMIs though.  IIUC, _everything_ between VM-Exit and the
> > > > invocation of the NMI handler needs to be noinstr.  On VM-Exit due
> > > > to NMI, NMIs are blocked.  If a #BP/#DB/#PF occurs before KVM gets
> > > > to kvm_x86_handle_exit_irqoff(), the subsequent IRET will unblock
> > > > NMIs before the original NMI is serviced, i.e. a second NMI could
> > > > come in at anytime regardless of how KVM forwards the NMI to the
> kernel.
> > > >
> > > > Is there any way to solve this without tagging everything noinstr?
> > > > There is a metric shit ton of code between VM-Exit and the
> > > > handling of NMIs, and much of that code is common helpers.  It
> > > > might be possible to hoist NMI handler much earlier, though we'd
> > > > need to do a super thorough audit to ensure all necessary host state is
> restored.
> > >
> > > As NMI is the only vector with this potential issue, it sounds a
> > > good idea to only promote its handling.
> > >
> >
> > Hi Peter/Sean,
> >
> > I prefer to move _everything_ between VM-Exit and the invocation of
> > the NMI handler into the noinstr section in the next patch set, how do you
> think?
> 
> That's likely going to be beyond painful and will have a _lot_ of collateral
> damage in the sense that other paths will end up calling noinstr function just
> because of VMX.  E.g. hw_breakpoint_restore(),
> fpu_sync_guest_vmexit_xfd_state(),
> kvm_get_apic_mode(), multiple static calls in KVM... the list goes on and on
> and on.
> 
> The ongoing maintenance for that would also be quite painful.

This is very complicated and I'm lost after reading the code around it.

Maybe some comments would help to explain the steps and order after VM exit.
(of course some people know it quite well)

> 
> Actually, SVM already enables NMIs far earlier, which means the probability of
> breaking something by moving NMI handling earlier is lower.  Not zero, as I
> don't trust that SVM gets all the corner right, but definitely low.
> 
> E.g. this should be doable

Do we need to recover *all* host states before switching to NMI stack and
handler?  if not what is the minimal set? If we restore the minimal set and
do "int $2" as early as possible, is it a quick and dirty approach?

> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
> cea8c07f5229..2fec93f38960 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7249,6 +7249,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu
> *vcpu)
>         if (unlikely(vmx->exit_reason.failed_vmentry))
>                 return EXIT_FASTPATH_NONE;
> 
> +       <handle NMIs here>
> +
>         vmx->loaded_vmcs->launched = 1;
> 
>         vmx_recover_nmi_blocking(vmx);
> 
> 
> thouh we'd like want a fair bit of refactoring so that all of vmx_vcpu_run() and
> svm_vcpu_run() don't need to be noinstr.

This sounds reasonable to me, however from Documentation/core-api/entry.rst,
we do need it. Maybe we could argue guest is logically orthogonal to host,
and the CPU is borrowed to another OS... (which also kind of explains nested) 

> 
> Another wart that needs to be addressed is trace_kvm_exit().  IIRC, tracepoints
> must be outside of noinstr, though maybe I'm misremembering that.  And
> conversely, SVM doesn't trace exits that are handled in the fastpath.  Best
> option is probably to move VMX's trace_kvm_exit() call to vmx_handle_exit(),
> and then figure out a common way to trace exits that are handled in the
> fastpath (which can reside outside of the noinstr section).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ