Message-ID: <727fbc0a-7edf-46a3-abaf-90ef32615b04@linux.intel.com>
Date: Thu, 11 Sep 2025 09:41:57 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Jim Mattson <jmattson@...gle.com>,
Mingwei Zhang <mizhang@...gle.com>, Zide Chen <zide.chen@...el.com>,
Das Sandipan <Sandipan.Das@....com>, Shukla Manali <Manali.Shukla@....com>,
Yi Lai <yi1.lai@...el.com>, Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [PATCH v2 3/5] KVM: Selftests: Validate more arch-events in
pmu_counters_test
On 9/11/2025 7:51 AM, Sean Christopherson wrote:
> On Fri, Jul 18, 2025, Dapeng Mi wrote:
>> Clearwater Forest introduces 5 new architectural events (4 topdown
>> level 1 metrics events and the LBR inserts event). This patch adds
>> support for validating these 5 newly added events. Detailed info about
>> these 5 events can be found in SDM section 21.2.7 "Pre-defined
>> Architectural Performance Events".
>>
>> It becomes unrealistic to traverse all possible combinations of the
>> unavailable events mask (validating every combination may take dozens
>> of minutes). So limit the unavailable events mask traversal to the
>> first 8 arch-events.
> Split these into separate patches. Burying a meaningful change like this in a big
> patch that seemingly just adds architectural collateral is pure evil.
>> @@ -612,15 +620,19 @@ static void test_intel_counters(void)
>> pr_info("Testing arch events, PMU version %u, perf_caps = %lx\n",
>> v, perf_caps[i]);
>> /*
>> - * To keep the total runtime reasonable, test every
>> - * possible non-zero, non-reserved bitmap combination
>> - * only with the native PMU version and the full bit
>> - * vector length.
>> + * To keep the total runtime reasonable, especially after
>> +			 * the total number of arch-events increased to 13, it's
>> +			 * impossible to test every possible non-zero, non-reserved
>> +			 * bitmap combination. Only test combinations of the first 8 bits
>> + * with the native PMU version and the full bit vector length.
>> */
>> if (v == pmu_version) {
>> - for (k = 1; k < (BIT(NR_INTEL_ARCH_EVENTS) - 1); k++)
>> + int max_events = min(NR_INTEL_ARCH_EVENTS, 8);
> Too arbitrary, and worse, bad coverage. And honestly, even iterating over 255
> (or 512?) different values is a waste of time. Ha! And test_arch_events() is
> buggy: it takes unavailable_mask as a u8 instead of a u32. I'll slot in a patch
> to fix that.
>
> As for the runtime, I think it's time to throw in the towel in terms of brute
> forcing the validation space, and just test a handful of hopefully-interesting
> values, e.g.
>
> ---
> .../selftests/kvm/x86/pmu_counters_test.c | 38 +++++++++++--------
> 1 file changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> index cfeed0103341..09ad68675576 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> @@ -577,6 +577,26 @@ static void test_intel_counters(void)
> PMU_CAP_FW_WRITES,
> };
>
> + /*
> + * To keep the total runtime reasonable, test only a handful of select,
> + * semi-arbitrary values for the mask of unavailable PMU events. Test
> + * 0 (all events available) and all ones (no events available) as well
> +	 * as alternating bit sequences, e.g. to detect if KVM is checking the
> + * wrong bit(s).
> + */
> + const uint32_t unavailable_masks[] = {
> + 0x0,
> + 0xffffffffu,
> + 0xf0f0f0f0u,
> + 0x0f0f0f0fu,
> + 0xaaaaaaaau,
> + 0xa0a0a0a0u,
> + 0x0a0a0a0au,
> + 0x55555555u,
> + 0x50505050u,
> + 0x05050505u,
> + };
Looks good to me. Just a minor suggestion: better to move 0x55555555u
right next to 0xaaaaaaaau, since they are the two complementary values;
that makes the sequence easier to understand.
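Something like this (same values as in your patch, just reordered so the
complementary pairs sit together; untested, for illustration only):

	const uint32_t unavailable_masks[] = {
		0x0,		/* all events available */
		0xffffffffu,	/* no events available */
		0xf0f0f0f0u,	/* complementary nibble patterns */
		0x0f0f0f0fu,
		0xaaaaaaaau,	/* complementary alternating-bit patterns */
		0x55555555u,
		0xa0a0a0a0u,	/* sparser variants of the patterns above */
		0x0a0a0a0au,
		0x50505050u,
		0x05050505u,
	};

Thanks.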
> +
> /*
> * Test up to PMU v5, which is the current maximum version defined by
> * Intel, i.e. is the last version that is guaranteed to be backwards
> @@ -614,16 +634,7 @@ static void test_intel_counters(void)
>
> pr_info("Testing arch events, PMU version %u, perf_caps = %lx\n",
> v, perf_caps[i]);
> - /*
> - * To keep the total runtime reasonable, test every
> - * possible non-zero, non-reserved bitmap combination
> - * only with the native PMU version and the full bit
> - * vector length.
> - */
> - if (v == pmu_version) {
> - for (k = 1; k < (BIT(NR_INTEL_ARCH_EVENTS) - 1); k++)
> - test_arch_events(v, perf_caps[i], NR_INTEL_ARCH_EVENTS, k);
> - }
> +
> /*
>		 * Test single bits for all PMU versions and lengths up to
>		 * the number of events +1 (to verify KVM doesn't do
> @@ -632,11 +643,8 @@ static void test_intel_counters(void)
> * ones i.e. all events being available and unavailable.
> */
> for (j = 0; j <= NR_INTEL_ARCH_EVENTS + 1; j++) {
> - test_arch_events(v, perf_caps[i], j, 0);
> - test_arch_events(v, perf_caps[i], j, -1u);
> -
> - for (k = 0; k < NR_INTEL_ARCH_EVENTS; k++)
> - test_arch_events(v, perf_caps[i], j, BIT(k));
> + for (k = 1; k < ARRAY_SIZE(unavailable_masks); k++)
> + test_arch_events(v, perf_caps[i], j, unavailable_masks[k]);
> }
>
> pr_info("Testing GP counters, PMU version %u, perf_caps = %lx\n",