lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191114163444.GD24045@linux.intel.com>
Date:   Thu, 14 Nov 2019 08:34:45 -0800
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Wanpeng Li <kernellwp@...il.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org,
        Radim Krčmář <rkrcmar@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>
Subject: Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath

On Thu, Nov 14, 2019 at 04:44:33PM +0100, Paolo Bonzini wrote:
> On 14/11/19 16:22, Sean Christopherson wrote:
> >> Instead of a separate vcpu->fast_vmexit, perhaps you can set exit_reason
> >> to vmx->exit_reason to -1 if the fast path succeeds.
> > 
> > Actually, rather than make this super special case, what about moving the
> > handling into vmx_handle_exit_irqoff()?  Practically speaking that would
> > only add ~50 cycles (two VMREADs) relative to the code being run right
> > after kvm_put_guest_xcr0().  It has the advantage of restoring the host's
> > hardware breakpoints, preserving a semi-accurate last_guest_tsc, and
> > running with vcpu->mode set back to OUTSIDE_GUEST_MODE.  Hopefully it'd
> > also be more intuitive for people unfamiliar with the code.
> 
> Yes, that's a good idea.  The expensive bit between handle_exit_irqoff
> and handle_exit is srcu_read_lock, which has two memory barriers in it.

Preaching to the choir at this point, but it'd also eliminate latency
spikes due to interrupts.

> >>> +			if (ret == 0)
> >>> +				ret = kvm_skip_emulated_instruction(vcpu);
> >> Please move the "kvm_skip_emulated_instruction(vcpu)" to
> >> vmx_handle_exit, so that this basically is
> >>
> >> #define EXIT_REASON_NEED_SKIP_EMULATED_INSN -1
> >>
> >> 	if (ret == 0)
> >> 		vcpu->exit_reason = EXIT_REASON_NEED_SKIP_EMULATED_INSN;
> >>
> >> and handle_ipi_fastpath can return void.
> >
> > I'd rather we add a dedicated variable to say the exit has already been
> > handled.  Overloading exit_reason is bound to cause confusion, and that's
> > probably a best case scenario.
> 
> I proposed the fake exit reason to avoid a ternary return code from
> handle_ipi_fastpath (return to guest, return to userspace, call
> kvm_x86_ops->handle_exit), which Wanpeng's patch was mishandling.

For this case, I think we can get away with a WARN if kvm_lapic_reg_write()
fails since it (currently) can't fail for ICR.  That would allow us to keep
a void return for ->handle_exit_irqoff() and avoid an overloaded return
value.

And, if the fastpath is used for all ICR writes, not just FIXED+PHYSICAL,
and is implemented for SVM, then we don't even need a a flag, e.g.
kvm_x2apic_msr_write() can simply ignore ICR writes, similar to how
handle_exception() ignores #MC and NMIs.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 87b0fcc23ef8..d7b79f7faac1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2615,12 +2615,11 @@ int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data)
        if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(apic))
                return 1;

-       if (reg == APIC_ICR2)
+
+       /* ICR2 writes are ignored and ICR writes are handled early. */
+       if (reg == APIC_ICR2 || reg == APIC_ICR)
                return 1;

-       /* if this is ICR write vector before command */
-       if (reg == APIC_ICR)
-               kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(data >> 32));
        return kvm_lapic_reg_write(apic, reg, (u32)data);
 }

Another bonus to this approach is that the refactoring for the
exit_reason can be done in a separate series.

> To ensure confusion does not become the best case scenario, perhaps it
> is worth trying to push exit_reason into vcpu_enter_guest's stack.
> vcpu_enter_guest can pass a pointer to it, and then it can be passed
> back into kvm_x86_ops->handle_exit{,_irqoff}.  It could be a struct too,
> instead of just a bare u32.

On the other hand, if it's a bare u32 then kvm_x86_ops->run can simply
return the value instead of doing out parameter shenanigans.

> This would ensure at compile-time that exit_reason is not accessed
> outside the short path from vmexit to kvm_x86_ops->handle_exit.

That would be nice.  Surprisingly, the code actually appears to be fairly
clean, I thought for sure the nested stuff would be using exit_reason all
over the place.  The only one that needs to be fixed is handle_vmfunc(),
which passes vmx->exit_reason when forwarding the VM-Exit instead of simply
hardcoding EXIT_REASON_VMFUNC.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ