Message-ID: <20211207014106.GB32145@shbuild999.sh.intel.com>
Date: Tue, 7 Dec 2021 09:41:06 +0800
From: Feng Tang <feng.tang@...el.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...el.com>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
linux-kernel@...r.kernel.org, rui.zhang@...el.com,
andi.kleen@...el.com, len.brown@...el.com, tim.c.chen@...el.com
Subject: Re: [PATCH v3 2/2] x86/tsc: skip tsc watchdog checking for qualified
platforms
Hi Paul,
On Tue, Nov 30, 2021 at 08:28:15AM -0800, Paul E. McKenney wrote:
> On Tue, Nov 30, 2021 at 11:02:56PM +0800, Feng Tang wrote:
> > And similar big gap between 'tsc' and 'hpet' is seen for the server
> > case (5.5 kernel which doesn't have the cs_watchdog_read() patchset).
> >
> > [1196945.314929] clocksource: timekeeping watchdog on CPU67: Marking clocksource 'tsc' as unstable because the skew is too large:
> > [1196945.314935] clocksource: 'hpet' wd_now: 25272026 wd_last: 2e9ce418 mask: ffffffff
> > [1196945.314938] clocksource: 'tsc' cs_now: 95b400003fdf1 cs_last: 95ae7ed7c33f7 mask: ffffffffffffffff
> > [1196945.314948] tsc: Marking TSC unstable due to clocksource watchdog
> > [1196945.314977] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
> > [1196945.314981] sched_clock: Marking unstable (1196945264804527, 50153181)<-(1196945399926576, -84962703)
> > [1196945.316255] clocksource: Switched to clocksource hpet
> >
> > For this case, I don't have access to the HW and only have the
> > dmesg log, from which it seems the watchdog timer has been postponed
> > a very long time from running.
>
> Thank you for the analysis!
>
> One approach to handle this situation would be to avoid checking for
> clock skew if the time since the last watchdog read was more than (say)
> twice the desired watchdog spacing. This does leave open the question of
> exactly which clocksource to use to measure the time between successive
> clocksource reads. My thought is to check this only once upon entry to
> the handler and to use the designated-good clocksource.
>
> Does that make sense, or would something else work better?
For this case, where the watchdog timer has been delayed for far too
long (170 seconds here), it may be a general problem. IIRC, there
was a similar report on LKML for a non-x86 platform.
As for a fix, I thought about scaling the comparison: say the timer
is delayed by 10 seconds and our checking interval is 500 ms, then
maybe we can lift the checking margin to 20X. But this has the problem
that the watchdog's counter could wrap. In the above case, the HPET
has already wrapped once (about 170+ seconds), and the wrap time
could be much shorter for other timers (4 seconds for the acpi_pm timer?).
So your idea of limiting the max delay is reasonable.
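To make the wrap concern and the "limit the max delay" idea concrete, here
is a minimal user-space sketch. It is not the kernel's clocksource watchdog
code: the names WD_INTERVAL_NS, WD_MAX_DELAY_NS, masked_delta(),
wrap_seconds() and wd_check_should_skip() are hypothetical, the 500 ms
interval is the example figure used above, and the 2x threshold is just the
"(say) twice the desired watchdog spacing" from the quoted proposal. How the
elapsed time is measured is left open and passed in as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration only, not kernel constants. */
#define WD_INTERVAL_NS  500000000ULL            /* 500 ms checking interval */
#define WD_MAX_DELAY_NS (2ULL * WD_INTERVAL_NS) /* "twice the desired spacing" */

/*
 * Elapsed ticks of a free-running counter whose width is given by mask.
 * The result is only correct modulo (mask + 1): if the counter wrapped
 * one or more times between the two reads, that information is lost.
 */
static uint64_t masked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
    return (now - last) & mask;
}

/* Seconds until a counter that is 'bits' wide and runs at 'hz' wraps. */
static double wrap_seconds(unsigned int bits, double hz)
{
    assert(bits < 64 && hz > 0.0);
    return (double)(1ULL << bits) / hz;
}

/*
 * Decide whether to skip the skew comparison for this round.
 * elapsed_ns is the time since the previous watchdog read, measured by
 * whatever reference the caller trusts (an open question in the thread).
 */
static bool wd_check_should_skip(uint64_t elapsed_ns)
{
    return elapsed_ns > WD_MAX_DELAY_NS;
}
```

With the 'hpet' values from the log, masked_delta(0x25272026, 0x2e9ce418,
0xffffffff) gives 0xf68a3c0e, which by itself cannot say whether the counter
wrapped in between. For a 24-bit ACPI PM timer at its nominal 3.579545 MHz,
wrap_seconds(24, 3579545.0) is roughly 4.7 seconds, in line with the
"4 seconds" guess above. A 170 second delay would make
wd_check_should_skip() return true, so no skew verdict would be issued for
that round.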
Thanks,
Feng
> Thanx, Paul