linux-kernel - Re: [PATCH v7 03/19] KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural encodings


Message-ID: <ZUvi6P7iKMtsS8wm@google.com>
Date:   Wed, 8 Nov 2023 11:35:04 -0800
From:   Sean Christopherson <seanjc@...gle.com>
To:     Kan Liang <kan.liang@...ux.intel.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Dapeng Mi <dapeng1.mi@...ux.intel.com>,
        Jim Mattson <jmattson@...gle.com>,
        Jinrong Liang <cloudliang@...cent.com>,
        Aaron Lewis <aaronlewis@...gle.com>,
        Like Xu <likexu@...cent.com>
Subject: Re: [PATCH v7 03/19] KVM: x86/pmu: Remove KVM's enumeration of
 Intel's architectural encodings

On Wed, Nov 08, 2023, Kan Liang wrote:
> On 2023-11-07 7:31 p.m., Sean Christopherson wrote:
> > @@ -442,8 +396,29 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  	return 0;
> >  }
> >  
> > +/*
> > + * Map fixed counter events to architectural general purpose event encodings.
> > + * Perf doesn't provide APIs to allow KVM to directly program a fixed counter,
> > + * and so KVM instead programs the architectural event to effectively request
> > + * the fixed counter.  Perf isn't guaranteed to use a fixed counter and may
> > + * instead program the encoding into a general purpose counter, e.g. if a
> > + * different perf_event is already utilizing the requested counter, but the end
> > + * result is the same (ignoring the fact that using a general purpose counter
> > + * will likely exacerbate counter contention).
> > + *
> > + * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
> > + * as there is no architectural general purpose encoding for reference cycles.
> 
> It's not the case for the latest Intel platforms anymore. Please see
> ffbe4ab0beda ("perf/x86/intel: Extend the ref-cycles event to GP counters").

Ugh, yeah.  But that should actually be easier to do on top.

> Maybe perf should export .event_map to KVM somehow.

Oh for ***** sake, perf already does export this for KVM.  Untested, but the below
should do the trick.  If I need to spin another version of this series then I'll
fold it in, otherwise I'll post it as something on top.

There's also an optimization to be had for kvm_pmu_trigger_event(), which incurs
an indirect branch not only on every invocation, but on every iteration.  I'll
post this one separately.

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 5fc5a62af428..a02e13c2e5e6 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -405,25 +405,32 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  * different perf_event is already utilizing the requested counter, but the end
  * result is the same (ignoring the fact that using a general purpose counter
  * will likely exacerbate counter contention).
- *
- * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
- * as there is no architectural general purpose encoding for reference cycles.
  */
 static u64 intel_get_fixed_pmc_eventsel(int index)
 {
-       const struct {
-               u8 eventsel;
-               u8 unit_mask;
-       } fixed_pmc_events[] = {
-               [0] = { 0xc0, 0x00 }, /* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
-               [1] = { 0x3c, 0x00 }, /* CPU Cycles/ PERF_COUNT_HW_CPU_CYCLES. */
-               [2] = { 0x00, 0x03 }, /* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES*/
+       enum perf_hw_id perf_id;
+       u64 eventsel;
+
+       BUILD_BUG_ON(KVM_PMC_MAX_FIXED != 3);
+
+       switch (index) {
+       case 0:
+               perf_id = PERF_COUNT_HW_INSTRUCTIONS;
+               break;
+       case 1:
+               perf_id = PERF_COUNT_HW_CPU_CYCLES;
+               break;
+       case 2:
+               perf_id = PERF_COUNT_HW_REF_CPU_CYCLES;
+               break;
+       default:
+               WARN_ON_ONCE(1);
+               return 0;
        };
 
-       BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
-
-       return (fixed_pmc_events[index].unit_mask << 8) |
-               fixed_pmc_events[index].eventsel;
+       eventsel = perf_get_hw_event_config(perf_id);
+       WARN_ON_ONCE(!eventsel && index < kvm_pmu_cap.num_counters_fixed);
+       return eventsel;
 }
 
 static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
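
For anyone who wants to poke at the mapping outside the kernel, here is a
minimal userspace sketch of the idea in the diff above: translate the fixed
counter index to a generic hardware event ID, then ask a lookup helper for the
encoding.  Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in
and not a kernel API; in particular, demo_get_hw_event_config() only imitates
what perf_get_hw_event_config() is assumed to return, and the table values are
simply the (unit_mask << 8) | eventsel results from the hardcoded array that
the diff removes.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the perf_hw_id values used in the diff. */
enum demo_hw_id {
	DEMO_HW_CPU_CYCLES,
	DEMO_HW_INSTRUCTIONS,
	DEMO_HW_REF_CPU_CYCLES,
	DEMO_HW_MAX,
};

/*
 * Assumed encodings, copied from the table the diff deletes:
 * (unit_mask << 8) | eventsel.  A real host would get these from perf.
 */
static const uint64_t demo_event_map[DEMO_HW_MAX] = {
	[DEMO_HW_CPU_CYCLES]     = (0x00 << 8) | 0x3c,
	[DEMO_HW_INSTRUCTIONS]   = (0x00 << 8) | 0xc0,
	[DEMO_HW_REF_CPU_CYCLES] = (0x03 << 8) | 0x00,
};

/* Stand-in for the perf lookup; returns 0 for an unknown event ID. */
static uint64_t demo_get_hw_event_config(int id)
{
	if (id < 0 || id >= DEMO_HW_MAX)
		return 0;
	return demo_event_map[id];
}

/* Mirrors the switch in the diff: fixed counter index -> event encoding. */
static uint64_t demo_fixed_pmc_eventsel(int index)
{
	int id;

	switch (index) {
	case 0:
		id = DEMO_HW_INSTRUCTIONS;
		break;
	case 1:
		id = DEMO_HW_CPU_CYCLES;
		break;
	case 2:
		id = DEMO_HW_REF_CPU_CYCLES;
		break;
	default:
		/* Unknown fixed counter; the kernel version WARNs here. */
		return 0;
	}

	return demo_get_hw_event_config(id);
}
```

With the assumed table, demo_fixed_pmc_eventsel() yields the same values as
the old hardcoded array for indices 0-2.  The point of the indirection is that
swapping the table, e.g. for a host that has a real GP encoding for reference
cycles, requires no change to the index-to-event switch.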