Message-ID: <b476b348-e803-46c2-a068-26a694019d4d@linux.intel.com>
Date: Thu, 11 Sep 2025 09:55:02 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Jim Mattson <jmattson@...gle.com>,
Mingwei Zhang <mizhang@...gle.com>, Zide Chen <zide.chen@...el.com>,
Das Sandipan <Sandipan.Das@....com>, Shukla Manali <Manali.Shukla@....com>,
Yi Lai <yi1.lai@...el.com>, Dapeng Mi <dapeng1.mi@...el.com>,
dongsheng <dongsheng.x.zhang@...el.com>
Subject: Re: [PATCH v2 4/5] KVM: selftests: Relax precise event count
validation as overcount issue
On 9/11/2025 7:56 AM, Sean Christopherson wrote:
> On Fri, Jul 18, 2025, Dapeng Mi wrote:
>> From: dongsheng <dongsheng.x.zhang@...el.com>
>>
>> On Intel Atom CPUs, the PMU events "Instruction Retired" and
>> "Branch Instruction Retired" may be overcounted for certain
>> instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
>> and complex SGX/SMX/CSTATE instructions/flows.
>>
>> The detailed information can be found in the errata (section SRF7):
>> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
>>
>> On Atom platforms up to and including Sierra Forest, both events,
>> "Instruction Retired" and "Branch Instruction Retired", are overcounted
>> on these instructions, whereas on Clearwater Forest only the
>> "Instruction Retired" event is overcounted on them.
>>
>> Because VM-Exit/VM-Entry is itself subject to the overcount, there is
>> no way to validate the precise count for these two events on the
>> affected Atom platforms, so relax the precise event count check for
>> these two events on those platforms.
>>
>> Signed-off-by: dongsheng <dongsheng.x.zhang@...el.com>
>> Co-developed-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>> Tested-by: Yi Lai <yi1.lai@...el.com>
>> ---
> ...
>
>> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
>> index 342a72420177..074cdf323406 100644
>> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
>> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
>> @@ -52,6 +52,9 @@ struct kvm_intel_pmu_event {
>> struct kvm_x86_pmu_feature fixed_event;
>> };
>>
>> +
>> +static uint8_t inst_overcount_flags;
>> +
>> /*
>> * Wrap the array to appease the compiler, as the macros used to construct each
>> * kvm_x86_pmu_feature use syntax that's only valid in function scope, and the
>> @@ -163,10 +166,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
>>
>> switch (idx) {
>> case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
>> - GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
>> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
>> + if (inst_overcount_flags & INST_RETIRED_OVERCOUNT)
>> + GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
>> + else
>> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
>> break;
>> case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
>> - GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
>> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
>> + if (inst_overcount_flags & BR_RETIRED_OVERCOUNT)
>> + GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
>> + else
>> + GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
>> break;
>> case INTEL_ARCH_LLC_REFERENCES_INDEX:
>> case INTEL_ARCH_LLC_MISSES_INDEX:
>> @@ -335,6 +346,7 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
>> length);
>> vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
>> unavailable_mask);
>> + sync_global_to_guest(vm, inst_overcount_flags);
> Rather than force individual tests to sync_global_to_guest(), and to cache the
> value, I think it makes sense to handle this automatically in kvm_arch_vm_post_create(),
> similar to things like host_cpu_is_intel and host_cpu_is_amd.
Yeah, that is the better place.
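
To make the idea concrete, here is a minimal standalone sketch of "compute once on the host, copy into every VM at creation". It does not use the selftests framework: struct fake_vm, init_host_errata(), fake_vm_post_create() and the platform enum are hypothetical stand-ins, and the platform-to-errata mapping simply follows the patch description above.

```c
#include <stdint.h>

/* Hypothetical platform classes, taken from the patch description. */
enum atom_platform {
	NOT_AFFECTED,
	SIERRA_FOREST_OR_EARLIER_ATOM,
	CLEARWATER_FOREST,
};

#define ERRATA_INSTRUCTIONS_RETIRED_OVERCOUNT	(UINT64_C(1) << 0)
#define ERRATA_BRANCHES_RETIRED_OVERCOUNT	(UINT64_C(1) << 1)

/* Stand-in for a VM: holds the guest's private copy of the host global. */
struct fake_vm {
	uint64_t guest_errata_mask;
};

/* Host-side global, computed once rather than per test. */
static uint64_t host_errata_mask;

static void init_host_errata(enum atom_platform platform)
{
	switch (platform) {
	case SIERRA_FOREST_OR_EARLIER_ATOM:
		/* Both events are overcounted. */
		host_errata_mask = ERRATA_INSTRUCTIONS_RETIRED_OVERCOUNT |
				   ERRATA_BRANCHES_RETIRED_OVERCOUNT;
		break;
	case CLEARWATER_FOREST:
		/* Only "Instruction Retired" is overcounted. */
		host_errata_mask = ERRATA_INSTRUCTIONS_RETIRED_OVERCOUNT;
		break;
	default:
		host_errata_mask = 0;
		break;
	}
}

/* Plays the role of an arch post-create hook: every new VM gets the copy. */
static void fake_vm_post_create(struct fake_vm *vm)
{
	vm->guest_errata_mask = host_errata_mask;
}
```

With this shape, individual tests never need to remember to sync the value themselves; any VM created after init_host_errata() sees the same mask.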
>
> And explicitly call these out as errata, so that it's super clear that we're
> working around PMU/CPU flaws, not KVM bugs. With some shenanigans, we can even
> reuse the this_pmu_has()/this_cpu_has() terminology as this_pmu_has_errata(), and
> hide the use of a bitmask too.
Agreed.
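
For illustration, one way the bitmask could be hidden behind a this_pmu_has_errata() helper is sketched below. The enum name, the pmu_errata_mask variable and the bit layout are assumptions made for this example, not necessarily what the final patch will use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical errata identifiers; each value is a bit position in the mask. */
enum pmu_errata {
	INSTRUCTIONS_RETIRED_OVERCOUNT,
	BRANCHES_RETIRED_OVERCOUNT,
};

/* Assumed to be set once on the host and mirrored into the guest. */
static uint64_t pmu_errata_mask;

/* Callers name the erratum; the mask and bit arithmetic stay private. */
static inline bool this_pmu_has_errata(enum pmu_errata errata)
{
	return pmu_errata_mask & (UINT64_C(1) << errata);
}
```

A check then reads as this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT), with no flag macros or masks visible at the call site.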
>
> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> index d4f90f5ec5b8..046d992c5940 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> @@ -163,10 +163,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
>
> switch (idx) {
> case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (this_pmu_has_errata(INSTRUCTIONS_RETIRED_OVERCOUNT))
> + GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> break;
> case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT))
> + GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> break;
> case INTEL_ARCH_LLC_REFERENCES_INDEX:
> case INTEL_ARCH_LLC_MISSES_INDEX:
> diff --git a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
> index c15513cd74d1..1c5b7611db24 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
> @@ -214,8 +214,10 @@ static void remove_event(struct __kvm_pmu_event_filter *f, uint64_t event)
> do { \
> uint64_t br = pmc_results.branches_retired; \
> uint64_t ir = pmc_results.instructions_retired; \
> + bool br_matched = this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT) ? \
> + br >= NUM_BRANCHES : br == NUM_BRANCHES; \
> \
> - if (br && br != NUM_BRANCHES) \
> + if (br && !br_matched) \
> pr_info("%s: Branch instructions retired = %lu (expected %u)\n", \
> __func__, br, NUM_BRANCHES); \
> TEST_ASSERT(br, "%s: Branch instructions retired = %lu (expected > 0)", \
Looks good to me.
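
The relaxed comparison is the same in both tests, so it can be read as a single predicate. The helper below is only a sketch of that logic; event_count_matches() is a hypothetical name and is not part of either diff.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With the erratum present the hardware may count extra events, so any
 * value at or above the expected count is accepted; otherwise the count
 * must match exactly.
 */
static bool event_count_matches(uint64_t count, uint64_t expected,
				bool may_overcount)
{
	return may_overcount ? count >= expected : count == expected;
}
```

Note that an undercount still fails in both modes, so the relaxation only tolerates the direction the erratum describes.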