lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZTrj1CRKLOVbcytz@google.com>
Date:   Thu, 26 Oct 2023 22:10:28 +0000
From:   Mingwei Zhang <mizhang@...gle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Jinrong Liang <cloudliang@...cent.com>,
        Like Xu <likexu@...cent.com>
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural
 events on gp counters

On Thu, Oct 26, 2023, Sean Christopherson wrote:
> On Thu, Oct 26, 2023, Mingwei Zhang wrote:
> > > +static bool pmu_is_intel_event_stable(uint8_t idx)
> > > +{
> > > +	switch (idx) {
> > > +	case INTEL_ARCH_CPU_CYCLES:
> > > +	case INTEL_ARCH_INSTRUCTIONS_RETIRED:
> > > +	case INTEL_ARCH_REFERENCE_CYCLES:
> > > +	case INTEL_ARCH_BRANCHES_RETIRED:
> > > +		return true;
> > > +	default:
> > > +		return false;
> > > +	}
> > > +}
> > 
> > Brief explanation on why other events are not stable please. Since there
> > are only a few architecture events, maybe listing all of them with
> > explanation in comments would work better.
> 
> Heh, I've already rewritten this logic to make 
> 
> 
> > > +
> > > +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> > > +				 uint32_t counter_msr, uint32_t nr_gp_counters)
> > > +{
> > > +	uint8_t idx = event.f.bit;
> > > +	unsigned int i;
> > > +
> > > +	for (i = 0; i < nr_gp_counters; i++) {
> > > +		wrmsr(counter_msr + i, 0);
> > > +		wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > > +		      ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> > > +		__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> > 
> > Some comment might be needed for readability. Abuptly inserting inline
> > assembly code in C destroys the readability.
> > 
> > I wonder do we need add 'clobber' here for the above line, since it
> > takes away ecx?
> 
> It's already there.  You can't directly clobber a register that is used as an
> input constraint.  The workaround is to make the register both an input and an
> output, hense the "+c" in the outputs section instead of just "c" in the inputs
> section.  The extra bit of cleverness is to use an intermediate anonymous variable
> so that NUM_BRANCHES can effectively be passed in (#defines won't work as output
> constraints).
> 
> > Also, I wonder if we need to disable IRQ here? This code might be
> > intercepted and resumed. If so, then the test will get a different
> > number?
> 
> This is guest code, disabling IRQs is pointless.  There are no guest virtual IRQs,
> guarding aginst host IRQs is impossible, unnecessary, and actualy undesirable,
> i.e. the guest vPMU shouldn't be counting host instructions and whatnot.
> 
> > > +
> > > +		if (pmu_is_intel_event_stable(idx))
> > > +			GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
> > 
> > Okay, just the counter value is non-zero means we pass the test ?!
> 
> FWIW, I've updated 
> 
> > hmm, I wonder other than IRQ stuff, what else may affect the result? NMI
> > watchdog or what?
> 
> This is the beauty of selftests.  There _so_ simple that there are very few
> surprises.  E.g. there are no events of any kind unless the test explicitly
> generates them.  The downside is that doing anything complex in selftests requires
> writing a fair bit of code.

Understood, so we could support precise matching.
>
> > > +
> > > +		wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > > +		      !ARCH_PERFMON_EVENTSEL_ENABLE |
> > > +		      intel_pmu_arch_events[idx]);
> > > +		wrmsr(counter_msr + i, 0);
> > > +		__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> > ditto for readability. Please consider using a macro to avoid repeated
> > explanation.
> 
> Heh, already did this too.  Though I'm not entirely sure it's more readable.  It's
> definitely more precise and featured :-)
> 
Oh dear, this is challenging to my rusty inline assembly skills :)

> #define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP)				\
> do {										\
> 	__asm__ __volatile__("wrmsr\n\t"					\
> 			     clflush "\n\t"					\
> 			     "mfence\n\t"					\
> 			     "1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t"	\
> 			     FEP "loop .\n\t"					\
> 			     FEP "mov %%edi, %%ecx\n\t"				\
> 			     FEP "xor %%eax, %%eax\n\t"				\
> 			     FEP "xor %%edx, %%edx\n\t"				\
> 			     "wrmsr\n\t"					\
> 			     : "+c"((int){_msr})				\
isn't it NUM_BRANCHES?
> 			     : "a"((uint32_t)_value), "d"(_value >> 32),	\
> 			       "D"(_msr)					\
> 	);									\
> } while (0)
>

do we need this label '1:' in the above code? It does not seems to be
used anywhere within the code.

why is clflush needed here?
> 
> > > +int main(int argc, char *argv[])
> > > +{
> > > +	TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> > > +
> > > +	TEST_REQUIRE(host_cpu_is_intel);
> > > +	TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> > > +	TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> > > +	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
> > 
> > hmm, this means we cannot run this in nested if X86_FEATURE_PDCM is
> > missing. It only affects full-width counter, right?
> 
> Ah, yeah, good call.  It won't be too much trouble to have the test play nice
> with !PDCM.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ