linux-kernel - Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

Message-ID: <f10b1eb8-db53-42e4-85ba-f38560724ae1@gmail.com>
Date:   Wed, 25 Oct 2023 11:17:20 +0800
From:   JinrongLiang <ljr.kernel@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Like Xu <likexu@...cent.com>,
        Jinrong Liang <cloudliang@...cent.com>
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural
 events on gp counters

On 2023/10/25 03:49, Sean Christopherson wrote:
> On Mon, Oct 23, 2023, Sean Christopherson wrote:
>> +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
>> +				 uint32_t counter_msr, uint32_t nr_gp_counters)
>> +{
>> +	uint8_t idx = event.f.bit;
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < nr_gp_counters; i++) {
>> +		wrmsr(counter_msr + i, 0);
>> +		wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
>> +		      ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
>> +		__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>> +
>> +		if (pmu_is_intel_event_stable(idx))
>> +			GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
>> +
>> +		wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
>> +		      !ARCH_PERFMON_EVENTSEL_ENABLE |
>> +		      intel_pmu_arch_events[idx]);
>> +		wrmsr(counter_msr + i, 0);
>> +		__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>> +
>> +		if (pmu_is_intel_event_stable(idx))
>> +			GUEST_ASSERT(!_rdpmc(i));
>> +	}
>> +
>> +	GUEST_DONE();
>> +}
>> +
>> +static void guest_measure_loop(uint8_t idx)
>> +{
>> +	const struct {
>> +		struct kvm_x86_pmu_feature gp_event;
>> +	} intel_event_to_feature[] = {
>> +		[INTEL_ARCH_CPU_CYCLES]		   = { X86_PMU_FEATURE_CPU_CYCLES },
>> +		[INTEL_ARCH_INSTRUCTIONS_RETIRED]  = { X86_PMU_FEATURE_INSNS_RETIRED },
>> +		[INTEL_ARCH_REFERENCE_CYCLES]	   = { X86_PMU_FEATURE_REFERENCE_CYCLES },
>> +		[INTEL_ARCH_LLC_REFERENCES]	   = { X86_PMU_FEATURE_LLC_REFERENCES },
>> +		[INTEL_ARCH_LLC_MISSES]		   = { X86_PMU_FEATURE_LLC_MISSES },
>> +		[INTEL_ARCH_BRANCHES_RETIRED]	   = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
>> +		[INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
>> +	};
>> +
>> +	uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
>> +	uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
>> +	struct kvm_x86_pmu_feature gp_event;
>> +	uint32_t counter_msr;
>> +	unsigned int i;
>> +
>> +	if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
>> +		counter_msr = MSR_IA32_PMC0;
>> +	else
>> +		counter_msr = MSR_IA32_PERFCTR0;
>> +
>> +	gp_event = intel_event_to_feature[idx].gp_event;
>> +	TEST_ASSERT_EQ(idx, gp_event.f.bit);
>> +
>> +	if (pmu_version < 2) {
>> +		guest_measure_pmu_v1(gp_event, counter_msr, nr_gp_counters);
> 
> Looking at this again, testing guest PMU version 1 is practically impossible
> because this testcase doesn't force the guest PMU version.  I.e. unless I'm
> missing something, this requires old hardware or running in a VM with its PMU
> forced to '1'.
> 
> And if all subtests use similar inputs, the common configuration can be shoved
> into pmu_vm_create_with_one_vcpu().
> 
> It's easy enough to fold test_intel_arch_events() into test_intel_counters(),
> which will also provide coverage for running with full-width writes enabled.  The
> only downside is that the total runtime will be longer.
> 
>> +static void test_arch_events_cpuid(uint8_t i, uint8_t j, uint8_t idx)
>> +{
>> +	uint8_t arch_events_unavailable_mask = BIT_ULL(j);
>> +	uint8_t arch_events_bitmap_size = BIT_ULL(i);
>> +	struct kvm_vcpu *vcpu;
>> +	struct kvm_vm *vm;
>> +
>> +	vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_measure_loop);
>> +
>> +	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
>> +				arch_events_bitmap_size);
>> +	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
>> +				arch_events_unavailable_mask);
>> +
>> +	vcpu_args_set(vcpu, 1, idx);
>> +
>> +	run_vcpu(vcpu);
>> +
>> +	kvm_vm_free(vm);
>> +}
>> +
>> +static void test_intel_arch_events(void)
>> +{
>> +	uint8_t idx, i, j;
>> +
>> +	for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {
> 
> There's no need to iterate over each event in the host, we can simply add a wrapper
> for guest_measure_loop() in the guest.  That'll be slightly faster since it won't
> require creating and destroying a VM for every event.
> 
>> +		/*
>> +		 * A brute force iteration of all combinations of values is
>> +		 * likely to exhaust the limit of the single-threaded thread
>> +		 * fd nums, so it's test by iterating through all valid
>> +		 * single-bit values.
>> +		 */
>> +		for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
> 
> This is flawed/odd.  'i' becomes arch_events_bitmap_size, i.e. it's a length,
> but the length is computed by BIT(i).  That's nonsensical and will eventually
> result in undefined behavior.  Oof, that'll actually happen sooner than later
> because arch_events_bitmap_size is only a single byte, i.e. when the number of
> events hits 9, this will try to shove 256 into an 8-bit variable.
> 
> The more correct approach would be to pass in 0..NR_INTEL_ARCH_EVENTS inclusive
> as the size.  But I think we should actually test 0..length+1, where "length" is
> the max of the native length and NR_INTEL_ARCH_EVENTS, i.e. we should verify that
> KVM handles a size larger than the native length.
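
To make the truncation described above concrete, here is a minimal standalone
sketch. It is not the selftest code: BIT_ULL() is redefined locally as a
stand-in for the kernel macro, and bitmap_size_from_bit() and
max_tested_length() are hypothetical helper names used only for illustration.
In standard C the conversion to an 8-bit unsigned type wraps modulo 256, so a
ninth event would silently produce a length of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL(); assumed here so the sketch builds alone. */
#define BIT_ULL(n) (1ULL << (n))

/*
 * Mirrors what the quoted patch does: the "length" is computed as a single
 * bit and stored in one byte.  For i == 8 the value 256 wraps to 0.
 */
static uint8_t bitmap_size_from_bit(uint8_t i)
{
	return (uint8_t)BIT_ULL(i);
}

/*
 * Upper bound of the suggested sweep 0..length+1, where "length" is the max
 * of the native bit vector length and the number of known events.
 */
static uint8_t max_tested_length(uint8_t native_length, uint8_t nr_events)
{
	uint8_t length = native_length > nr_events ? native_length : nr_events;

	/* Guard against wrapping when adding one. */
	assert(length < UINT8_MAX);
	return (uint8_t)(length + 1);
}
```

With a native length of 8 and 7 known events, the sweep in this sketch would
cover lengths 0 through 9, which includes one value beyond what hardware
reports.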
> 
>> +			for (j = 0; j < NR_INTEL_ARCH_EVENTS; j++)
>> +				test_arch_events_cpuid(i, j, idx);
> 
> And here, I think it makes sense to brute force all possible values for at least
> one configuration.  There aren't actually _that_ many values, e.g. currently it's
> 64 (I think).  E.g. test the native PMU version with the "full" length, and then
> test single bits with varying lengths.
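
One possible reading of the two-pass plan above, as a standalone sketch rather
than selftest code. The names enumerate_configs() and config_fn are
hypothetical, and the choice of sweeping lengths 0..nr_events+1 in the second
pass is an assumption borrowed from the earlier length discussion. The callback
is where a real test would create a vCPU and set the CPUID properties.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: invoked once per (length, unavailable mask) pair. */
typedef void (*config_fn)(uint8_t length, uint32_t unavailable_mask, void *ctx);

/*
 * Enumerate the configurations and return how many were visited.
 * Pass NULL for fn to only count them.
 */
static unsigned int enumerate_configs(uint8_t nr_events, config_fn fn, void *ctx)
{
	unsigned int n = 0;
	unsigned int length, bit;
	uint32_t mask;

	/* The mask is built with 32-bit shifts, so keep the event count small. */
	assert(nr_events < 32);

	/* Pass 1: "full" length, brute force every possible mask value. */
	for (mask = 0; mask < (UINT32_C(1) << nr_events); mask++, n++) {
		if (fn)
			fn(nr_events, mask, ctx);
	}

	/* Pass 2: a single unavailable bit, with lengths 0..nr_events+1. */
	for (length = 0; length <= (unsigned int)nr_events + 1; length++) {
		for (bit = 0; bit < nr_events; bit++, n++) {
			if (fn)
				fn((uint8_t)length, UINT32_C(1) << bit, ctx);
		}
	}

	return n;
}
```

For three events this visits 8 full-length masks plus 5 * 3 single-bit
combinations, so the total stays small enough to run in one pass.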
> 
> I'll send a v6 later this week.

Got it, thanks.

Please feel free to let me know if there's anything you'd like me to do.