linux-kernel - Re: [PATCH v2 3/6] KVM: x86: selftests: Set up AMD VM in pmu_counters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <gsntplkh2wia.fsf@coltonlewis-kvm.c.googlers.com>
Date: Mon, 20 Jan 2025 18:03:25 +0000
From: Colton Lewis <coltonlewis@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, mizhang@...gle.com, ljr.kernel@...il.com, 
	jmattson@...gle.com, aaronlewis@...gle.com, pbonzini@...hat.com, 
	shuah@...nel.org, linux-kselftest@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/6] KVM: x86: selftests: Set up AMD VM in pmu_counters_test

Hey Sean,

Thanks for the review.

Sean Christopherson <seanjc@...gle.com> writes:

> On Wed, Sep 18, 2024, Colton Lewis wrote:
>> Branch in main() depending on if the CPU is Intel or AMD. They are
>> subject to vastly different requirements because the AMD PMU lacks
>> many properties defined by the Intel PMU including the entire CPUID
>> 0xa function where Intel stores all the PMU properties. AMD lacks this
>> as well as any consistent notion of PMU versions as Intel does. Every
>> feature is a separate flag and they aren't the same features as Intel.

>> Set up a VM for testing core AMD counters and ensure proper CPUID
>> features are set.

>> Signed-off-by: Colton Lewis <coltonlewis@...gle.com>
>> ---
>>   .../selftests/kvm/x86_64/pmu_counters_test.c  | 104 ++++++++++++++----
>>   1 file changed, 83 insertions(+), 21 deletions(-)

>> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c  
>> b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>> index 0e305e43a93b..5b240585edc5 100644
>> --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>> @@ -30,10 +30,21 @@
>>   #define NUM_INSNS_RETIRED		(NUM_LOOPS * NUM_INSNS_PER_LOOP +  
>> NUM_EXTRA_INSNS)


>> +/*
>> + * Limit testing to MSRs that are actually defined by Intel (in the  
>> SDM).  MSRs
>> + * that aren't defined counter MSRs *probably* don't exist, but there's  
>> no
>> + * guarantee that currently undefined MSR indices won't be used for  
>> something
>> + * other than PMCs in the future.
>> + */
>> +#define MAX_NR_GP_COUNTERS	8
>> +#define MAX_NR_FIXED_COUNTERS	3
>> +#define AMD_NR_CORE_COUNTERS	4
>> +#define AMD_NR_CORE_EXT_COUNTERS	6
>> +
>>   static uint8_t kvm_pmu_version;
>>   static bool kvm_has_perf_caps;

>> -static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu  
>> **vcpu,
>> +static struct kvm_vm *intel_pmu_vm_create(struct kvm_vcpu **vcpu,
>>   						  void *guest_code,

> When renaming things, please fixup the alignment as needed.  Yes, it's  
> more
> churn, but one-time pain is preferable to living indefinitely with funky  
> formatting.

Understood

> I also don't like renaming just one symbol.  E.g. the above  
> MAX_NR_GP_COUNTERS
> and MAX_NR_FIXED_COUNTERS #defines are Intel specific, but that's not at  
> all
> clear from the code.  Ditto for guest_rd_wr_counters() vs.
> guest_test_rdwr_core_counters().

> Given how little code is actually shared between Intel and AMD, I think  
> it makes
> sense to have the bulk of the code live in separate .c files.  Since
> tools/testing/selftests/kvm/lib/x86/pmu.c is already a thing, the best  
> option is
> probably to rename pmu_counters_test.c to intel_pmu_counters_test.c, and  
> then
> extract the common bits to lib/x86/pmu.c (or include/x86/pmu.h as  
> appropriate).

Yeah I see what you mean about making processor-specific functions more
obvious and think separating to different files would help a lot with
that.

>>   						  uint8_t pmu_version,
>>   						  uint64_t perf_capabilities)
>> +static void test_core_counters(void)
>> +{
>> +	uint8_t nr_counters = nr_core_counters();
>> +	bool core_ext = kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
>> +	bool perfmon_v2 = kvm_cpu_has(X86_FEATURE_PERFMON_V2);
>> +	struct kvm_vcpu *vcpu;
>> +	struct kvm_vm *vm;
>> +
>> +	for (uint8_t ce = 0; ce <= core_ext; ce++) {

> Kernel style is to not declared variables inside for-loops.

I ran it through checkpatch and it didn't complain.

>> +		for (uint8_t pm = 0; pm <= perfmon_v2; pm++) {

> Iterating over booleans is decidedly odd, the indentation levels are  
> painful and
> will only get worse as more features are added, and the "ce" and "pm"  
> variables
> aren't all that intuitive.  More below.
>> +			for (uint8_t nc = 0; nc <= nr_counters; nc++) {

> I also find "nc" to be unintuitive.  Either use a fully descriptive name,  
> or
> make it obvious that the variables is an iterator.  E.g. either

> 	uint8_t max_nr_counters = nr_core_counters();

> 	...

> 		for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {


> or

> 		for (j = 0; j < nr_counters; j++) {


> 'j' is obviously not descriptive, but when reading the usage, it's more  
> obvious
> that it's a loop iterator (if you choose

Ok I'll go with the generic loop iterator name.

>> +				vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);
>> +
>> +				if (nc)

> Is '0' not a legal number of counters?

AMD64 Architecture Programmers Manual Volume 2 Chapter 13.2.2 states
that only nonzero values for that property are meaningful.

With 0, you are supposed to assume either 4 or 6 counters depending on
the CoreExt feature.

I could make 0 mean 0 counters inside the guest but that would break
what hardware does.

>> +					vcpu_set_cpuid_property(

> Google3!  (Never, ever wrap immediately after the opening paranethesis).

Checkpatch didn't complain.

>> +						vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE, nc);
>> +				if (ce)
>> +					vcpu_set_cpuid_feature(
>> +						vcpu, X86_FEATURE_PERF_CTR_EXT_CORE);

> This likely doesn't do what you want.  By default, vm_arch_vcpu_add()  
> initializes
> CPUID to KVM's supported CPUID.  So only _setting_ the feature means that  
> that
> the test is likely only ever running with the full set of supported  
> features.

Ok, so I have to explicitly unset the feature if I don't want it.

> Jumping back to my complaints with the for-loops, if the features to  
> interate on
> are collected in an array, then the test can generate a mask of all  
> possible
> combinations and iterate over that (plus the array itself).  That keeps  
> the
> indentation bounded and eliminates the copy+paste needed to add a new  
> feature.
> The only downside is that the test is limited to 64 features, but we'll  
> run into
> run time issues long before that limit is reached.

> 	const struct kvm_x86_cpu_feature pmu_features[] = {
> 		X86_FEATURE_PERF_CTR_EXT_CORE,
> 		X86_FEATURE_PERFMON_V2,
> 	};

> 	const u64 pmu_features_mask = BIT_ULL(ARRAY_SIZE(pmu_features)) - 1;

> 	for (mask = 0; mask <= pmu_features_mask; mask++) {
> 		for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {
> 			vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);

> 			vcpu_set_cpuid_property(vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE,
> 						nr_counters);

> 			/* Comment goes here */
> 			for (i = 0; i < ARRAY_SIZE(pmu_features); i++)
> 				vcpu_set_or_clear_cpuid_feature(vcpu, pmu_features[i],
> 								mask & BIT_ULL(i));

> 			...
> 	}

I thought of putting the features in a bitmask but worried it obscured
the intent too much. Listing features in an array and constructing the
bitmask seems clear enough to me so I will use that technique now.

>> +				if (pm)
>> +					vcpu_set_cpuid_feature(
>> +						vcpu, X86_FEATURE_PERFMON_V2);
>> +
>> +				pr_info("Testing core counters: CoreExt = %u, PerfMonV2 = %u,  
>> NumCounters = %u\n",
>> +					ce, pm, nc);
>> +				run_vcpu(vcpu);
>> +
>> +				kvm_vm_free(vm);
>> +			}
>> +		}
>> +	}
>> +}