[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a0193f93-64cb-4935-b70f-aeee66eb6d44@opensynergy.com>
Date: Wed, 13 Mar 2024 18:50:59 +0100
From: Peter Hilber <peter.hilber@...nsynergy.com>
To: David Woodhouse <dwmw2@...radead.org>,
Alexandre Belloni <alexandre.belloni@...tlin.com>
Cc: linux-kernel@...r.kernel.org, virtualization@...ts.linux.dev,
virtio-dev@...ts.oasis-open.org, linux-arm-kernel@...ts.infradead.org,
linux-rtc@...r.kernel.org,
"virtio-comment@...ts.oasis-open.org" <virtio-comment@...ts.oasis-open.org>,
"Christopher S. Hall" <christopher.s.hall@...el.com>,
Jason Wang <jasowang@...hat.com>, John Stultz <jstultz@...gle.com>,
"Michael S. Tsirkin" <mst@...hat.com>, netdev@...r.kernel.org,
Richard Cochran <richardcochran@...il.com>, Stephen Boyd <sboyd@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Xuan Zhuo
<xuanzhuo@...ux.alibaba.com>, Marc Zyngier <maz@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Alessandro Zummo <a.zummo@...ertech.it>, "Ridoux, Julien"
<ridouxj@...zon.com>
Subject: Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes
On 13.03.24 15:06, David Woodhouse wrote:
> On Wed, 2024-03-13 at 13:58 +0100, Alexandre Belloni wrote:
>> The TSC or whatever CPU counter/clock that is used to keep the system
>> time is not an RTC, I don't get why it has to be exposed as such to the
>> guests. PTP is fine and precise, RTC is not.
>
> Ah, I see. But the point of the virtio_rtc is not really to expose that
> CPU counter. The point is to report the wallclock time, just like an
> actual RTC. The real difference is the *precision*.
>
> The virtio_rtc device has a facility to *also* expose the counter,
> because that's what we actually need to gain that precision...
>
> Applications don't read the RTC every time they want to know what the
> time is. These days, they don't even make a system call; it's done
> entirely in userspace mode. The kernel exposes some shared memory,
> essentially saying "the counter was X at time Y, and runs at Z Hz".
> Then applications just read the CPU counter and do some arithmetic.
>
> As we require more and more precision in the calibration, it becomes
> important to get *paired* readings of the CPU counter and the wallclock
> time at precisely the same moment. If the guest has to read one and
> then the other, potentially taking interrupts, getting preempted and
> suffering steal/SMI time in the middle, that introduces an error which
> is increasingly significant as we increasingly care about precision.
>
> Peter's proposal exposes the pairs of {X,Y} and leaves *all* the guest
> kernels having to repeat readings over time and perform the calibration
> as the underlying hardware oscillator frequency (Z) drifts with
> temperature. I'm trying to get him to let the hypervisor expose the
> calibrated frequency Z too. Along with *error* bounds for ±δX and ±δZ.
> Which aside from reducing the duplication of effort, will *also* fix
> the problem of live migration where *all* those things suffer a step
> change and leave the guest with an inaccurate clock but not knowing it.
I am already convinced that this would work significantly better than the
{X,Y} pair (but would be a bit more effort to implement):
1. when accessed by user space, obviously
2. when backing the PTP clock, it saves CPU time and makes non-paired
reads more precise.
I would just prefer to try upstreaming the {X,Y} pairing first. I think the
{X,Y,Z...} pairing could be discussed and developed in parallel.
Thanks for the comments,
Peter
Powered by blists - more mailing lists