Message-ID: <88c65b89-5174-4076-82cd-7852c8c25b66@intel.com>
Date: Tue, 11 Jun 2024 16:07:12 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: <isaku.yamahata@...el.com>, <pbonzini@...hat.com>,
<erdemaktas@...gle.com>, <vkuznets@...hat.com>, <vannapurve@...gle.com>,
<jmattson@...gle.com>, <mlevitsk@...hat.com>, <xiaoyao.li@...el.com>,
<chao.gao@...el.com>, <rick.p.edgecombe@...el.com>, <yuan.yao@...el.com>,
<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH V8 1/2] KVM: selftests: Add x86_64 guest udelay() utility
Hi Sean,
On 6/11/24 3:03 PM, Sean Christopherson wrote:
> On Tue, Jun 11, 2024, Reinette Chatre wrote:
>>> Heh, the docs are stale. KVM hasn't returned an error since commit cc578287e322
>>> ("KVM: Infrastructure for software and hardware based TSC rate scaling"), which
>>> again predates selftests by many years (6+ in this case). To make our lives
>>> much simpler, I think we should assert that KVM_GET_TSC_KHZ succeeds, and maybe
>>> throw in a GUEST_ASSERT(tsc_khz) in udelay()?
>>
>> I added the GUEST_ASSERT() but I find that it comes with a caveat (more below).
>>
>> I plan an assert as below that would end up testing the same as what a
>> GUEST_ASSERT(tsc_khz) would accomplish:
>>
>> r = __vm_ioctl(vm, KVM_GET_TSC_KHZ, NULL);
>> TEST_ASSERT(r > 0, "KVM_GET_TSC_KHZ did not provide a valid TSC freq.");
>> tsc_khz = r;
>>
>>
>> Caveat is: with the additional GUEST_ASSERT(), every test that uses udelay() in
>> the guest is now implicitly required to implement a ucall (UCALL_ABORT) handler.
>> A crude grep check shows that of the 69 x86_64 tests, 47 do
>> indeed have a UCALL_ABORT handler. If any of the others use udelay() then the
>> GUEST_ASSERT() will of course still trigger, but the failure will be quite cryptic. For
>> example, "Unhandled exception '0xe' at guest RIP '0x0'" vs. "tsc_khz".
>
> Yeah, we really need to add a bit more infrastructure, there is way, way, waaaay
> too much boilerplate needed just to run a guest and handle the basic ucalls.
> Reporting guest asserts should Just Work for 99.9% of tests.
>
> Anyways, is it any less cryptic if ucall_assert() forces a failure? I forget if
> the problem with an unhandled GUEST_ASSERT() is that the test re-enters the guest,
> or if it's something else.
>
> I don't think we need a perfect solution _now_, as tsc_khz really should never
> be 0, just something to not make life completely miserable for future developers.
>
> diff --git a/tools/testing/selftests/kvm/lib/ucall_common.c b/tools/testing/selftests/kvm/lib/ucall_common.c
> index 42151e571953..1116bce5cdbf 100644
> --- a/tools/testing/selftests/kvm/lib/ucall_common.c
> +++ b/tools/testing/selftests/kvm/lib/ucall_common.c
> @@ -98,6 +98,8 @@ void ucall_assert(uint64_t cmd, const char *exp, const char *file,
>
> ucall_arch_do_ucall((vm_vaddr_t)uc->hva);
>
> + ucall_arch_do_ucall(GUEST_UCALL_FAILED);
> +
> ucall_free(uc);
> }
>
Thank you very much.
With your suggestion, an example unhandled GUEST_ASSERT() looks as below.
It does not indicate what (beyond vcpu_run()) triggered the assert, but it
does hint that ucall handling may need to be added.
[SNIP]
==== Test Assertion Failure ====
lib/ucall_common.c:154: addr != (void *)GUEST_UCALL_FAILED
pid=16002 tid=16002 errno=4 - Interrupted system call
1 0x000000000040da91: get_ucall at ucall_common.c:154
2 0x0000000000410142: assert_on_unhandled_exception at processor.c:614
3 0x0000000000406590: _vcpu_run at kvm_util.c:1718
4 (inlined by) vcpu_run at kvm_util.c:1729
5 0x00000000004026cf: test_apic_bus_clock at apic_bus_clock_test.c:115
6 (inlined by) run_apic_bus_clock_test at apic_bus_clock_test.c:164
7 (inlined by) main at apic_bus_clock_test.c:201
8 0x00007fb1d8429d8f: ?? ??:0
9 0x00007fb1d8429e3f: ?? ??:0
10 0x00000000004027a4: _start at ??:?
Guest failed to allocate ucall struct
[SNIP]
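To illustrate why the output above ends with "Guest failed to allocate
ucall struct": the host-side check only sees the address the guest passed
to the ucall, so the extra ucall with the failure sentinel is reported with
the sentinel's existing message. Below is a minimal standalone sketch of
that classification, not the selftests code itself. All sketch_* names are
hypothetical, and the sentinel value is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Assumption for illustration only: the failure sentinel is modeled as
 * all-ones. The real definition lives in lib/ucall_common.c.
 */
#define SKETCH_UCALL_FAILED ((uint64_t)-1)

enum sketch_ucall_kind {
	SKETCH_NO_UCALL,	/* the exit carried no ucall address */
	SKETCH_SENTINEL,	/* guest signalled failure via the sentinel */
	SKETCH_VALID,		/* address of a ucall struct to read */
};

/* Classify the address the host reads back after the guest exits. */
static inline enum sketch_ucall_kind sketch_classify_ucall(uint64_t addr)
{
	if (addr == 0)
		return SKETCH_NO_UCALL;
	if (addr == SKETCH_UCALL_FAILED)
		return SKETCH_SENTINEL;
	return SKETCH_VALID;
}
```

In this model a test without a UCALL_ABORT handler re-enters the guest
after the first ucall, the second ucall delivers the sentinel, and the
host stops on the sentinel check instead of on a later unrelated exception.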
Is this acceptable? I can add a new preparatory patch with your
suggestion, with the goal of providing a slightly better error message
when there is an unhandled ucall.
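For context on why a zero tsc_khz deserves an assert at all, here is a
minimal sketch of the microsecond-to-TSC-cycle conversion that a TSC-based
delay loop depends on. It is not the code from the patch; the function name
is hypothetical and the arithmetic is just one plausible way to do the
conversion.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of TSC cycles that span 'usec' microseconds
 * when the TSC ticks at 'tsc_khz' kHz. One kHz is one cycle per
 * millisecond, hence the division by 1000 to get to microseconds.
 */
static inline uint64_t sketch_usec_to_tsc_cycles(uint64_t tsc_khz,
						 uint64_t usec)
{
	return tsc_khz * usec / 1000;
}
```

With tsc_khz == 0 the result is always 0 cycles, so a delay loop built on
it would return immediately without any error. That silent failure is
what asserting on a nonzero tsc_khz guards against.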
Reinette