[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d528515c-4c74-a948-1db5-cba77d692ef3@bytedance.com>
Date: Thu, 14 Oct 2021 15:03:36 +0800
From: yanghui <yanghui.def@...edance.com>
To: brookxu <brookxu.cn@...il.com>,
John Stultz <john.stultz@...aro.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Stephen Boyd <sboyd@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Clocksource: Avoid misjudgment of clocksource
在 2021/10/12 下午4:06, brookxu 写道:
>
>
> John Stultz wrote on 2021/10/12 13:29:
>> On Mon, Oct 11, 2021 at 10:23 PM brookxu <brookxu.cn@...il.com> wrote:
>>> John Stultz wrote on 2021/10/12 12:52 下午:
>>>> On Sat, Oct 9, 2021 at 7:04 AM brookxu <brookxu.cn@...il.com> wrote:
>>>>>
>>>>> hello
>>>>>
>>>>> John Stultz wrote on 2021/10/9 7:45:
>>>>>> On Fri, Oct 8, 2021 at 1:03 AM yanghui <yanghui.def@...edance.com> wrote:
>>>>>>>
>>>>>>> clocksource_watchdog is executed every WATCHDOG_INTERVAL(0.5s) by
>>>>>>> Timer. But sometimes system is very busy and the Timer cannot be
>>>>>>> executed in 0.5sec. For example,if clocksource_watchdog be executed
>>>>>>> after 10sec, the calculated value of abs(cs_nsec - wd_nsec) will
>>>>>>> be enlarged. Then the current clocksource will be misjudged as
>>>>>>> unstable. So we add conditions to prevent the clocksource from
>>>>>>> being misjudged.
>>>>>>>
>>>>>>> Signed-off-by: yanghui <yanghui.def@...edance.com>
>>>>>>> ---
>>>>>>> kernel/time/clocksource.c | 6 +++++-
>>>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
>>>>>>> index b8a14d2fb5ba..d535beadcbc8 100644
>>>>>>> --- a/kernel/time/clocksource.c
>>>>>>> +++ b/kernel/time/clocksource.c
>>>>>>> @@ -136,8 +136,10 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
>>>>>>>
>>>>>>> /*
>>>>>>> * Interval: 0.5sec.
>>>>>>> + * MaxInterval: 1s.
>>>>>>> */
>>>>>>> #define WATCHDOG_INTERVAL (HZ >> 1)
>>>>>>> +#define WATCHDOG_MAX_INTERVAL_NS (NSEC_PER_SEC)
>>>>>>>
>>>>>>> static void clocksource_watchdog_work(struct work_struct *work)
>>>>>>> {
>>>>>>> @@ -404,7 +406,9 @@ static void clocksource_watchdog(struct timer_list *unused)
>>>>>>>
>>>>>>> /* Check the deviation from the watchdog clocksource. */
>>>>>>> md = cs->uncertainty_margin + watchdog->uncertainty_margin;
>>>>>>> - if (abs(cs_nsec - wd_nsec) > md) {
>>>>>>> + if ((abs(cs_nsec - wd_nsec) > md) &&
>>>>>>> + cs_nsec < WATCHDOG_MAX_INTERVAL_NS &&
>>>>>>
>>>>>> Sorry, it's been awhile since I looked at this code, but why are you
>>>>>> bounding the clocksource delta here?
>>>>>> It seems like if the clocksource being watched was very wrong (with a
>>>>>> delta larger than the MAX_INTERVAL_NS), we'd want to throw it out.
>>>>>>
>>>>>>> + wd_nsec < WATCHDOG_MAX_INTERVAL_NS) {
>>>>>>
>>>>>> Bounding the watchdog interval on the check does seem reasonable.
>>>>>> Though one may want to keep track that if we are seeing too many of
>>>>>> these delayed watchdog checks we provide some feedback via dmesg.
>>>>>
>>>>> For some fast timeout timers, such as acpi-timer, checking wd_nsec should not
>>>>> make much sense, because when wacthdog is called, the timer may overflow many
>>>>> times.
>>>>
>>>> Indeed. But in that case we can't tell which way is up. This is what I
>>>> was fretting about when I said:
>>>>> So I do worry these watchdog robustness fixes are papering over a
>>>>> problem, pushing expectations closer to the edge of how far the system
>>>>> should tolerate bad behavior. Because at some point we'll fall off. :)
>>>>
>>>> If the timer is delayed long enough for the watchdog to wrap, we're
>>>> way out of tolerable behavior. There's not much we can do because we
>>>> can't even tell what happened.
>>>>
>>>> But in the case where the watchdog has not wrapped, I don't see a
>>>> major issue with trying to be a bit more robust in the face of just
>>>> slightly delayed timers.
>>>> (And yes, we can't really distinguish between slightly delayed and
>>>> watchdog-wrap-interval + slight delay, but in either case we can
>>>> probably skip disqualifying the clocksource as we know something seems
>>>> off)
>>>
>>> If we record the watchdog's start_time in clocksource_start_watchdog(), and then
>>> when we verify cycles in clocksource_watchdog(), check whether the clocksource
>>> watchdog is blocked. Due to MSB verification, if the blocked time is greater than
>>> half of the watchdog timer max_cycles, then we can safely ignore the current
>>> verification? Do you think this idea is okay?
>>
>> I can't say I totally understand the idea. Maybe could you clarify with a patch?
>>
>
> Sorry, it looks almost as follows:
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index b8a14d2..87f3b67 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -119,6 +119,7 @@
> static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
> static DEFINE_SPINLOCK(watchdog_lock);
> static int watchdog_running;
> +static unsigned long watchdog_start_time;
> static atomic_t watchdog_reset_pending;
>
> static inline void clocksource_watchdog_lock(unsigned long *flags)
> @@ -356,6 +357,7 @@ static void clocksource_watchdog(struct timer_list *unused)
> int next_cpu, reset_pending;
> int64_t wd_nsec, cs_nsec;
> struct clocksource *cs;
> + unsigned long max_jiffies;
> u32 md;
>
> spin_lock(&watchdog_lock);
> @@ -402,6 +404,10 @@ static void clocksource_watchdog(struct timer_list *unused)
> if (atomic_read(&watchdog_reset_pending))
> continue;
>
> + max_jiffies = nsecs_to_jiffies(cs->max_idle_ns);
> + if (time_is_before_jiffies(watchdog_start_time + max_jiffies))
> + continue;
> +
> /* Check the deviation from the watchdog clocksource. */
> md = cs->uncertainty_margin + watchdog->uncertainty_margin;
> if (abs(cs_nsec - wd_nsec) > md) {
> @@ -474,6 +480,7 @@ static void clocksource_watchdog(struct timer_list *unused)
> * pair clocksource_stop_watchdog() clocksource_start_watchdog().
> */
> if (!timer_pending(&watchdog_timer)) {
> + watchdog_start_time = jiffies;
> watchdog_timer.expires += WATCHDOG_INTERVAL;
> add_timer_on(&watchdog_timer, next_cpu);
> }
> @@ -488,6 +495,7 @@ static inline void clocksource_start_watchdog(void)
> timer_setup(&watchdog_timer, clocksource_watchdog, 0);
> watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
> add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask));
> + watchdog_start_time = jiffies;
> watchdog_running = 1;
> }
>
>
>> thanks
>> -john
>>
It looks good to me.
thanks.
Powered by blists - more mailing lists