lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87h6is7agi.fsf@redhat.com>
Date: Thu, 01 Feb 2024 11:19:57 +0100
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Jan Richter <jarichte@...hat.com>, linux-kernel@...r.kernel.org,
 kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>, Sean
 Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH] KVM: selftests: Compare wall time from xen shinfo
 against KVM_GET_CLOCK

David Woodhouse <dwmw2@...radead.org> writes:

> Sorry for delayed response.
>
> On Thu, 2024-01-11 at 14:59 +0100, Vitaly Kuznetsov wrote:
>> 
>> +       vm_ioctl(vm, KVM_GET_CLOCK, &kcdata);
>> +       delta = (wc->sec * NSEC_PER_SEC + wc->nsec) - (kcdata.realtime - kcdata.clock);
>
> I think you need to check for KVM_CLOCK_REALTIME in the flags here,
> don't you? It might not always be set.

Good suggestion; while this shouldn't be a common use-case on x86_64, it
is possible when TSC is *not* the clocksource used on the host. I guess
we can just skip the sub-test with a message in this case.

>
> And also, nobody should ever be using KVM_CLOCK_REALTIME as it stands
> at the moment. It's fundamentally broken because it should always have
> used CLOCK_TAI not CLOCK_REALTIME.
>
> I'm in the process of fixing that up as an incremental ABI change,
> putting the TAI offset into one of the spare pad fields in the
> kvm_clock_data and adding a new KVM_CLOCK_TAI_OFFSET flag.
>
>> +       TEST_ASSERT(llabs(delta) < 100,
>> +                   "Guest's epoch from shinfo %d.%09d differs from KVM_GET_CLOCK %lld.%lld",
>> +                   wc->sec, wc->nsec, (kcdata.realtime - kcdata.clock) / NSEC_PER_SEC,
>> +                   (kcdata.realtime - kcdata.clock) % NSEC_PER_SEC);
>> 
>
>> Replace the check with comparing wall clock data from shinfo to
>> KVM_GET_CLOCK. The later gives both realtime and kvmclock so guest's epoch
>> can be calculated by subtraction. Note, the computed epoch may still differ
>> a few nanoseconds from shinfo as different TSC is used and there are
>> rounding errors but 100 nanoseconds margin should be enough to cover
>> it (famous last words).
>
> Aren't those just bugs? Surely this *should* be cycle-perfect, if the
> host has a stable TSC?
>
> But I suppose this isn't the test case for that. This is just testing
> that the Xen shinfo is doing basically the right thing.
>
> And we're going to need a few more fixes like this WIP before we get
> get to cycle perfection from the horrid mess that is our kvmclock code:
> https://git.infradead.org/?p=users/dwmw2/linux.git;a=commitdiff;h=bc557e5ee4

(Some time ago I was considering suggesting we switch to Hypet-V TSC page
clocksource for Linux guests to avoid fixing the mess :-) It's becoming
less and less relevant as with modern hardware which supports TSC
scaling (and which doesn't support famous TSC bugs like TSC divergence
across sockets :-) passing through raw TSC to guests is becoming more
and more popular.)

Regarding this fix, what's your decree? My intention here is to
basically get rid on xen_shinfo flakyness without reducing test
coverage. I.e. we still want to test that shinfo data is somewhat
sane. Later, when you get your 'cycle perfection' and 'TAI' patches in,
we can always adjust the test accordingly. In case you agree, I can
re-send this with KVM_CLOCK_REALTIME check added.

-- 
Vitaly


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ