Message-ID: <665f749e-b71e-a793-d759-87f7cf89677c@bytedance.com>
Date: Sat, 9 Oct 2021 11:22:48 +0800
From: yanghui <yanghui.def@...edance.com>
To: John Stultz <john.stultz@...aro.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Stephen Boyd <sboyd@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH] Clocksource: Avoid misjudgment of
clocksource
On 2021/10/9 at 7:45 AM, John Stultz wrote:
> On Fri, Oct 8, 2021 at 1:03 AM yanghui <yanghui.def@...edance.com> wrote:
>>
>> clocksource_watchdog is executed every WATCHDOG_INTERVAL (0.5s) by a
>> timer. But sometimes the system is very busy and the timer cannot
>> run within 0.5sec. For example, if clocksource_watchdog is executed
>> after 10sec, the calculated value of abs(cs_nsec - wd_nsec) will
>> be enlarged. Then the current clocksource will be misjudged as
>> unstable. So we add conditions to prevent the clocksource from
>> being misjudged.
>>
>> Signed-off-by: yanghui <yanghui.def@...edance.com>
>> ---
>> kernel/time/clocksource.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
>> index b8a14d2fb5ba..d535beadcbc8 100644
>> --- a/kernel/time/clocksource.c
>> +++ b/kernel/time/clocksource.c
>> @@ -136,8 +136,10 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
>>
>> /*
>> * Interval: 0.5sec.
>> + * MaxInterval: 1s.
>> */
>> #define WATCHDOG_INTERVAL (HZ >> 1)
>> +#define WATCHDOG_MAX_INTERVAL_NS (NSEC_PER_SEC)
>>
>> static void clocksource_watchdog_work(struct work_struct *work)
>> {
>> @@ -404,7 +406,9 @@ static void clocksource_watchdog(struct timer_list *unused)
>>
>> /* Check the deviation from the watchdog clocksource. */
>> md = cs->uncertainty_margin + watchdog->uncertainty_margin;
>> - if (abs(cs_nsec - wd_nsec) > md) {
>> + if ((abs(cs_nsec - wd_nsec) > md) &&
>> + cs_nsec < WATCHDOG_MAX_INTERVAL_NS &&
>
> Sorry, it's been awhile since I looked at this code, but why are you
> bounding the clocksource delta here?
> It seems like if the clocksource being watched was very wrong (with a
> delta larger than the MAX_INTERVAL_NS), we'd want to throw it out.
>
>> + wd_nsec < WATCHDOG_MAX_INTERVAL_NS) {
>
> Bounding the watchdog interval on the check does seem reasonable.
> Though one may want to keep track that if we are seeing too many of
> these delayed watchdog checks we provide some feedback via dmesg.
Yes, checking only the watchdog delta is more reasonable.
I think a dmesg message alone is not enough. If tsc is misjudged as
unstable, the system switches to hpet, and hpet is very expensive for
performance. After that, the only way to switch back to tsc is to
reboot the server. We need to prevent the clocksource switch in the
case of a misjudgment.
Circumstances of the misjudgment:
If clocksource_watchdog runs after 10sec instead of 0.5sec, wd_delta
and cs_delta will both be about 10sec, so the value of
(cs_nsec - wd_nsec) is magnified 20 times (10sec/0.5sec) while the
margin md stays the same. In this case the clocksource is accurate and
it is the Timer that is late, so we misjudge the clocksource. We need
to prevent this from happening.
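
To make the arithmetic concrete, here is a minimal userspace sketch of
the check with only the watchdog delta bounded. It is a model, not the
kernel code: check_interval(), skewed_nsec(), the verdict enum, the
100 ppm skew and the 100 us margin are all assumptions made up for
illustration. Only WATCHDOG_MAX_INTERVAL_NS (1s) comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL
/* Bound proposed in the patch: 1 s. */
#define WATCHDOG_MAX_INTERVAL_NS NSEC_PER_SEC

/* Hypothetical verdicts for this model; the kernel has no such enum. */
enum wd_verdict { WD_OK, WD_UNSTABLE, WD_SKIPPED };

/*
 * Model of the deviation check (hypothetical helper).
 * cs_nsec, wd_nsec: elapsed ns as measured by the clocksource and by
 * the watchdog. md: allowed deviation in ns.
 */
static enum wd_verdict check_interval(int64_t cs_nsec, int64_t wd_nsec,
                                      int64_t md)
{
    int64_t diff;

    assert(md >= 0);

    /* The timer ran late, so this sample is not comparable: skip it. */
    if (wd_nsec >= WATCHDOG_MAX_INTERVAL_NS)
        return WD_SKIPPED;

    diff = cs_nsec - wd_nsec;
    if (diff < 0)
        diff = -diff;

    return diff > md ? WD_UNSTABLE : WD_OK;
}

/*
 * Elapsed ns seen by a clock that runs skew_ppm parts per million
 * faster than the watchdog (hypothetical helper for the example).
 */
static int64_t skewed_nsec(int64_t wd_nsec, int64_t skew_ppm)
{
    return wd_nsec + wd_nsec * skew_ppm / 1000000;
}
```

With a 100 ppm skew, skewed_nsec() gives a difference of 50 us over
0.5sec and 1 ms over 10sec, which is the 20 times magnification. With
a 100 us margin the first sample passes, and the second one is skipped
instead of being reported as unstable. A clocksource that is really
wrong within a normal 0.5sec interval is still caught.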
thanks
-Hui
>
> thanks
> -john
>