Message-ID: <20200519145756.GC317569@hirez.programming.kicks-ass.net>
Date: Tue, 19 May 2020 16:57:56 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Xu, Like" <like.xu@...el.com>
Cc: Like Xu <like.xu@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Sean Christopherson <sean.j.christopherson@...el.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>, ak@...ux.intel.com,
wei.w.wang@...el.com
Subject: Re: [PATCH v11 10/11] KVM: x86/pmu: Check guest LBR availability in
case host reclaims them
On Tue, May 19, 2020 at 09:10:58PM +0800, Xu, Like wrote:
> On 2020/5/19 19:15, Peter Zijlstra wrote:
> > On Thu, May 14, 2020 at 04:30:53PM +0800, Like Xu wrote:
> >
> > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > index ea4faae56473..db185dca903d 100644
> > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > @@ -646,6 +646,43 @@ static void intel_pmu_lbr_cleanup(struct kvm_vcpu *vcpu)
> > >  	intel_pmu_free_lbr_event(vcpu);
> > >  }
> > > +static bool intel_pmu_lbr_is_availabile(struct kvm_vcpu *vcpu)
> > > +{
> > > +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> > > +
> > > +	if (!pmu->lbr_event)
> > > +		return false;
> > > +
> > > +	if (event_is_oncpu(pmu->lbr_event)) {
> > > +		intel_pmu_intercept_lbr_msrs(vcpu, false);
> > > +	} else {
> > > +		intel_pmu_intercept_lbr_msrs(vcpu, true);
> > > +		return false;
> > > +	}
> > > +
> > > +	return true;
> > > +}
> > This is unreadable gunk, what?
>
> Abstractly, it is saying "KVM would passthrough the LBR stack MSRs if
> event_is_oncpu() is true, otherwise cancel the passthrough state if any."
>
> I'm using 'event->oncpu != -1' to represent the guest LBR event
> is scheduled on rather than 'event->state == PERF_EVENT_STATE_ERROR'.
>
> For intel_pmu_intercept_lbr_msrs(), false means to passthrough the LBR stack
> MSRs to the vCPU, and true means to cancel the passthrough state and make
> LBR MSR accesses trapped by the KVM.
To me it seems very weird to change state in a function that is supposed
to just query state.
'is_available' seems to suggest a simple: return 'lbr_event->state ==
PERF_EVENT_STATE_ACTIVE' or something.
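
A minimal sketch of that separation, assuming hypothetical mock_* stand-ins
rather than the real perf/KVM types: the query stays free of side effects,
and the MSR intercept update lives in its own, explicitly named helper.
None of these names exist in the kernel; they are illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the perf event states; not the kernel enum. */
enum mock_event_state {
	MOCK_STATE_ERROR    = -2,
	MOCK_STATE_OFF      = -1,
	MOCK_STATE_INACTIVE =  0,
	MOCK_STATE_ACTIVE   =  1,
};

struct mock_event {
	enum mock_event_state state;
};

/* Hypothetical vCPU view: just the LBR event and the intercept flag. */
struct mock_vcpu {
	struct mock_event *lbr_event;
	bool lbr_msrs_intercepted;
};

/* Pure query: reads state, changes nothing. */
static bool mock_lbr_is_available(const struct mock_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return vcpu->lbr_event &&
	       vcpu->lbr_event->state == MOCK_STATE_ACTIVE;
}

/* The state change is a separate step whose name says what it does. */
static void mock_lbr_update_intercept(struct mock_vcpu *vcpu)
{
	assert(vcpu != NULL);
	/* Intercept (trap) the LBR MSRs whenever the event is not active. */
	vcpu->lbr_msrs_intercepted = !mock_lbr_is_available(vcpu);
}
```

With this split a caller that only wants to know the state can call the
query freely, and the one place that flips the intercept is easy to find.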
> > > +static void intel_pmu_availability_check(struct kvm_vcpu *vcpu)
> > > +{
> > > +	lockdep_assert_irqs_disabled();
> > > +
> > > +	if (lbr_is_enabled(vcpu) && !intel_pmu_lbr_is_availabile(vcpu) &&
> > > +		(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR))
> > > +		pr_warn_ratelimited("kvm: vcpu-%d: LBR is temporarily unavailable.\n",
> > > +			vcpu->vcpu_id);
> > More unreadable nonsense; when the events go into ERROR state, it's a
> > permanent fail, they'll not come back.
> It's not true. The guest LBR event with 'ERROR state' or 'oncpu != -1'
> would be lazy released and re-created in the next time the
> intel_pmu_create_lbr_event() is called and it's supposed to be
> re-scheduled and re-do availability_check() as well.
Where? Also, wth would you need to destroy and re-create an event for
that?
> > > @@ -6696,8 +6696,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > >  	pt_guest_enter(vmx);
> > > -	if (vcpu_to_pmu(vcpu)->version)
> > > +	if (vcpu_to_pmu(vcpu)->version) {
> > >  		atomic_switch_perf_msrs(vmx);
> > > +		kvm_x86_ops.pmu_ops->availability_check(vcpu);
> > > +	}
> > AFAICT you just did a call out to the kvm_pmu crud in
> > atomic_switch_perf_msrs(), why do another call?
> In fact, availability_check() is only called here for just one time.
>
> The callchain looks like:
> - vmx_vcpu_run()
> - kvm_x86_ops.pmu_ops->availability_check();
> - intel_pmu_availability_check()
> - intel_pmu_lbr_is_availabile()
> - event_is_oncpu() ...
>
What I'm saying is that you just did a pmu_ops indirect call in
atomic_switch_perf_msrs(), why add another?