Message-ID: <87tw7txgx9.fsf@vitty.brq.redhat.com>
Date: Fri, 17 Feb 2017 11:14:42 +0100
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Andy Lutomirski <luto@...capital.net>,
"K. Y. Srinivasan" <kys@...rosoft.com>, X86 ML <x86@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Dexuan Cui <decui@...rosoft.com>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
devel@...uxdriverproject.org,
Linux Virtualization <virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH v2 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
Thomas Gleixner <tglx@...utronix.de> writes:
> On Wed, 15 Feb 2017, Vitaly Kuznetsov wrote:
>> Actually, we already have an implementation of TSC page update in KVM
>> (see arch/x86/kvm/hyperv.c, kvm_hv_setup_tsc_page()) and the update does
>> the following:
>>
>> 0) stash seq into seq_prev
>> 1) seq = 0 making all reads from the page invalid
>> 2) smp_wmb()
>> 3) update tsc_scale, tsc_offset
>> 4) smp_wmb()
>> 5) set seq = seq_prev + 1
>
> I hope they handle the case where seq_prev overflows and becomes 0 :)
>
>> As far as I understand this helps with situations you described above as
>> guest will notice either invalid value of 0 or seq change. In case the
>> implementation in real Hyper-V is the same we're safe with compile
>> barriers only.
>
> On x86 that's correct. smp_rmb() resolves to barrier(), but you certainly
> need the smp_wmb() on the writer side.
>
> Now looking at the above your reader side code is bogus:
>
> +	while (1) {
> +		sequence = tsc_pg->tsc_sequence;
> +		if (!sequence)
> +			break;
>
> Why would you break out of the loop when seq is 0? The 0 is just telling
> you that there is an update in progress.

Not only. As far as I understand (and I *think* K. Y. pointed this out),
when a VM is migrating to another host the TSC page clocksource is
disabled for an extended period of time, so we're better off reading from
the MSR than looping here. With regard to the vDSO, this means reverting
to a normal syscall.
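
To illustrate what I mean, here is a rough userspace sketch of a reader
that treats seq == 0 as "page disabled, fall back" rather than "update in
progress, spin". It is not the patch code: the struct layout, the helper
names (read_tsc_page(), mul_hi64()) and passing the TSC value in as an
argument are assumptions made to keep the example self-contained, and the
barriers are compiler barriers only, as discussed for x86 above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TSC page; only the fields discussed here. */
struct tsc_page {
	volatile uint32_t tsc_sequence;
	uint32_t reserved;
	volatile uint64_t tsc_scale;
	volatile int64_t tsc_offset;
};

/* Compiler barrier only (GCC/Clang syntax). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* High 64 bits of an unsigned 64x64 -> 128 bit multiplication. */
static uint64_t mul_hi64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/*
 * Returns true and stores the scaled time in *now when the page is valid.
 * Returns false when the sequence is 0, in which case the caller is
 * expected to fall back (MSR read in the kernel, real syscall in the vDSO).
 */
static bool read_tsc_page(const struct tsc_page *pg, uint64_t tsc,
			  uint64_t *now)
{
	uint32_t seq;
	uint64_t scale;
	int64_t offset;

	do {
		seq = pg->tsc_sequence;
		if (!seq)
			return false;	/* page disabled: do not spin */
		barrier();
		scale = pg->tsc_scale;
		offset = pg->tsc_offset;
		barrier();
	} while (seq != pg->tsc_sequence);	/* retry if an update raced us */

	*now = mul_hi64(tsc, scale) + (uint64_t)offset;
	return true;
}
```

The point is only the control flow: a zero sequence leaves the loop
immediately and reports failure, while any non-zero sequence that changes
during the read causes a retry.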
>
> The Linux seqcount writer side is:
>
> seq++;
> smp_wmb();
>
> update...
>
> smp_wmb();
> seq++;
>
> and it's defined that an odd sequence count, i.e. bit 0 set means update in
> progress. Which is nice, because you don't have to treat 0 special on the
> writer side and you don't need extra storage to stash seq away :)
>
> So the reader side does:
>
> 	do {
> 		while (1) {
> 			s = READ_ONCE(seq);
> 			if (!(s & 0x01))
> 				break;
> 			cpu_relax();
> 		}
> 		smp_rmb();
>
> 		read data ...
>
> 		smp_rmb();
> 	} while (s != seq)
>
> So for that hyperv thing you want:
>
> 	do {
> 		while (1) {
> 			s = READ_ONCE(seq);
> 			if (s)
> 				break;
> 			cpu_relax();
> 		}
> 		smp_rmb();
>
> 		read data ...
>
> 		smp_rmb();
> 	} while (s != seq)
>
> Thanks,
>
> tglx
--
Vitaly