[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.20.1710051943570.2398@nanos>
Date: Thu, 5 Oct 2017 20:01:16 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Gabriel Beddingfield <gabe@...tlabs.com>
cc: LKML <linux-kernel@...r.kernel.org>,
Stephen Boyd <sboyd@...eaurora.org>,
John Stultz <john.stultz@...aro.org>,
Alessandro Zummo <a.zummo@...ertech.it>,
Alexandre Belloni <alexandre.belloni@...e-electrons.com>,
linux-rtc@...r.kernel.org, Guy Erb <guy@...tlabs.com>,
Howard Harte <hharte@...tlabs.com>
Subject: Re: Extreme time jitter with suspend/resume cycles
Gabriel,
On Thu, 5 Oct 2017, Gabriel Beddingfield wrote:
> On Thu, Oct 5, 2017 at 4:01 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> > i.e. the 32bit rollover of the clocksource. So, if the clocksource->read()
> > function returns a full 64bit counter value, then it must have protection
> > against observing the rollover independent of the clock which feeds that
> > counter. Of course the frequency changes the probablity of observing it,
> > but still the read function must be protected against observing the
> > rollover unconditionally.
>
> Right, but isn't this what clocksource->mask is supposed to do? When we change
> the back-end frequency, we're still using the same front-end 32-bit register and
> we don't see the same jumps.
Right. That's what the mask should protect. I was assuming that this is one
of the fancy clocksources which expose two 32bit registers of a 64bit
counter and the rollover protection was missing. So that's not the
case. Good, or not so good :)
> > Which SoC/clocksource driver are you talking about?
>
> NXP i.MX 6SoloX
> drivers/clocksource/timer-imx-gpt.c
So that clocksource driver looks correct. Do you have an idea in which
context this time jump happens? Does it happen when you exercise your high
frequency suspend/resume dance or is that happening just when you let the
machine run forever as well?
The timekeeping_resume() path definitely has an issue:
cycle_now = tk_clock_read(&tk->tkr_mono);
if ((clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP) &&
cycle_now > tk->tkr_mono.cycle_last) {
This works nice for clocksources which wont wrap across suspend/resume but
not for those which can. That cycle_now -> cycle_last check should take
cs-mask into account ...
Of course for clocksources which can wrap within realistic suspend times,
which 36 hours might be accounted for, this would need an extra sanity
check against a RTC whether wrap time has been exceeded.
I haven't thought it through whether that buggered check fully explains
what you are observing, but it's wrong nevertheless. John?
Thanks,
tglx
Powered by blists - more mailing lists