linux-kernel - Re: [PATCH v10 30/39] KVM: selftests: Hyper-V PV TLB flush selftest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YyuVtrpQwZGHs4ez@google.com>
Date:   Wed, 21 Sep 2022 22:52:38 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Michael Kelley <mikelley@...rosoft.com>,
        Siddharth Chandrasekaran <sidcha@...zon.de>,
        Yuan Yao <yuan.yao@...ux.intel.com>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v10 30/39] KVM: selftests: Hyper-V PV TLB flush selftest

On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
> +/* 'Worker' vCPU code checking the contents of the test page */
> +static void worker_guest_code(vm_vaddr_t test_data)
> +{
> +	struct test_data *data = (struct test_data *)test_data;
> +	u32 vcpu_id = rdmsr(HV_X64_MSR_VP_INDEX);
> +	unsigned char chr_exp1, chr_exp2, chr_cur;

Any reason for "unsigned char" over uint8_t?

And the "chr_" prefix is rather weird, IMO it just makes the code harder to read.

Actually, why a single char?  E.g. why not do a uint64_t?  Oooh, because the
offset is only by vcpu_id, not by vcpu_id * PAGE_SIZE.  Maybe add a comment about
that somewhere?

> +
> +	x2apic_enable();
> +	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
> +
> +	for (;;) {
> +		/* Read the expected char, then check what's in the test pages and then
> +		 * check the expectation again to make sure it wasn't updated in the meantime.

Please wrap at the soft limit.

> +		 */

Except for apparently networking, kernel preferred style for block comments is:

		/*
		 * This comment is for KVM.
		 */

> +		chr_exp1 = READ_ONCE(*(unsigned char *)
> +				     (data->test_pages + PAGE_SIZE * NTEST_PAGES + vcpu_id));

Use a local variable for the pointer, then these line lengths are much more sane.
Hmm, and if you give them descriptive names, I think it will make the code much
easier to follow.  E.g. I've been staring at this test for ~10 minutes and am still
not entirely sure what shenanigans are going on.

> +		asm volatile("lfence");

The kernel versions of these are provided by tools/arch/x86/include/asm/barrier.h,
which I think is available?  I forget if we can use those in the selftests mess.

Regardless, this needs a comment explaining why LFENCE/rmb() is needed, and why
the writer needs MFENCE/mb().

> +		chr_cur = *(unsigned char *)data->test_pages;

READ_ONCE()?

> +		asm volatile("lfence");
> +		chr_exp2 = READ_ONCE(*(unsigned char *)
> +				     (data->test_pages + PAGE_SIZE * NTEST_PAGES + vcpu_id));
> +		if (chr_exp1 && chr_exp1 == chr_exp2)

IIUC, the "chr_exp1 != 0" check is the read side of "0 == disable".  Splitting
that out and adding a comment would be helpful.

And if a local variable is used to hold the pointer, there's no need for an "exp2"
variable.

> +			GUEST_ASSERT(chr_cur == chr_exp1);
> +		asm volatile("nop");

Use cpu_relax(), which KVM selftests provide.

All in all, something like this?

	for (;;) {
		cpu_relax();

		expected = READ_ONCE(*this_vcpu);
		
		/* ??? */
		rmb();
		val = READ_ONCE(*???);
		/* ??? */
		rmb();

		/*
		 * '0' indicates the sender is between iterations, wait until
		 * the sender is ready for this vCPU to start checking again.
		 */
		if (!expected)
			continue;

		/*
		 * Re-read the per-vCPU byte to ensure the sender didn't move
		 * onto a new iteration.
		 */	
		if (expected != READ_ONCE(*this_vcpu))
			continue;
		
		GUEST_ASSERT(val == expected);
	}

> +	}
> +}
> +
> +/*
> + * Write per-CPU info indicating what each 'worker' CPU is supposed to see in
> + * test page. '0' means don't check.
> + */
> +static void set_expected_char(void *addr, unsigned char chr, int vcpu_id)
> +{
> +	asm volatile("mfence");

Why MFENCE?

> +	*(unsigned char *)(addr + NTEST_PAGES * PAGE_SIZE + vcpu_id) = chr;
> +}
> +
> +/* Update PTEs swapping two test pages */
> +static void swap_two_test_pages(vm_paddr_t pte_gva1, vm_paddr_t pte_gva2)
> +{
> +	uint64_t pte[2];
> +
> +	pte[0] = *(uint64_t *)pte_gva1;
> +	pte[1] = *(uint64_t *)pte_gva2;
> +
> +	*(uint64_t *)pte_gva1 = pte[1];
> +	*(uint64_t *)pte_gva2 = pte[0];

xchg()?  swap()?

> +}
> +
> +/* Delay */
> +static inline void rep_nop(void)

LOL, rep_nop() is a hilariously confusing function name.  "REP NOP" is "PAUSE",
and for whatever reason the kernel proper use rep_nop() as the function name for
the wrapper.  My reaction to the MFENCE+rep_nop() below was "how the hell does
MFENCE+PAUSE guarantee a delay?!?".

Anyways, why not do e.g. usleep(1)?  And if you really need a udelay() and not a
usleep(), IMO it's worth adding exactly that instead of throwing NOPs at the CPU.
E.g. aarch64 KVM selftests already implements udelay(), so adding an x86 variant
would move us one step closer to being able to use it in common tests.


> +{
> +	int i;
> +
> +	for (i = 0; i < 1000000; i++)
> +		asm volatile("nop");
> +}
> +	r = pthread_create(&threads[0], NULL, vcpu_thread, vcpu[1]);
> +	TEST_ASSERT(r == 0,

!r is preferred

> +		    "pthread_create failed errno=%d", errno);

TEST_ASSERT() already captures errno, e.g. these can be:

	TEST_ASSERT(!r, "pthread_create() failed");

> +
> +	r = pthread_create(&threads[1], NULL, vcpu_thread, vcpu[2]);
> +	TEST_ASSERT(r == 0,
> +		    "pthread_create failed errno=%d", errno);
> +
> +	while (true) {
> +		r = _vcpu_run(vcpu[0]);
> +		exit_reason = vcpu[0]->run->exit_reason;
> +
> +		TEST_ASSERT(!r, "vcpu_run failed: %d\n", r);

Pretty sure newlines in asserts aren't necessary, though I forget if they cause
weirdness or just end up being ignored.

> +		TEST_ASSERT(exit_reason == KVM_EXIT_IO,
> +			    "unexpected exit reason: %u (%s)",
> +			    exit_reason, exit_reason_str(exit_reason));
> +
> +		switch (get_ucall(vcpu[0], &uc)) {
> +		case UCALL_SYNC:
> +			TEST_ASSERT(uc.args[1] == stage,
> +				    "Unexpected stage: %ld (%d expected)\n",
> +				    uc.args[1], stage);
> +			break;
> +		case UCALL_ABORT:
> +			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
> +				  __FILE__, uc.args[1]);

			REPORT_GUEST_ASSERT(uc);