lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Oct 2021 15:03:36 +0800
From:   yanghui <yanghui.def@...edance.com>
To:     brookxu <brookxu.cn@...il.com>,
        John Stultz <john.stultz@...aro.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Clocksource: Avoid misjudgment of clocksource



在 2021/10/12 下午4:06, brookxu 写道:
> 
> 
> John Stultz wrote on 2021/10/12 13:29:
>> On Mon, Oct 11, 2021 at 10:23 PM brookxu <brookxu.cn@...il.com> wrote:
>>> John Stultz wrote on 2021/10/12 12:52 下午:
>>>> On Sat, Oct 9, 2021 at 7:04 AM brookxu <brookxu.cn@...il.com> wrote:
>>>>>
>>>>> hello
>>>>>
>>>>> John Stultz wrote on 2021/10/9 7:45:
>>>>>> On Fri, Oct 8, 2021 at 1:03 AM yanghui <yanghui.def@...edance.com> wrote:
>>>>>>>
>>>>>>> clocksource_watchdog is executed every WATCHDOG_INTERVAL(0.5s) by
>>>>>>> Timer. But sometimes system is very busy and the Timer cannot be
>>>>>>> executed in 0.5sec. For example,if clocksource_watchdog be executed
>>>>>>> after 10sec, the calculated value of abs(cs_nsec - wd_nsec) will
>>>>>>> be enlarged. Then the current clocksource will be misjudged as
>>>>>>> unstable. So we add conditions to prevent the clocksource from
>>>>>>> being misjudged.
>>>>>>>
>>>>>>> Signed-off-by: yanghui <yanghui.def@...edance.com>
>>>>>>> ---
>>>>>>>   kernel/time/clocksource.c | 6 +++++-
>>>>>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
>>>>>>> index b8a14d2fb5ba..d535beadcbc8 100644
>>>>>>> --- a/kernel/time/clocksource.c
>>>>>>> +++ b/kernel/time/clocksource.c
>>>>>>> @@ -136,8 +136,10 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
>>>>>>>
>>>>>>>   /*
>>>>>>>    * Interval: 0.5sec.
>>>>>>> + * MaxInterval: 1s.
>>>>>>>    */
>>>>>>>   #define WATCHDOG_INTERVAL (HZ >> 1)
>>>>>>> +#define WATCHDOG_MAX_INTERVAL_NS (NSEC_PER_SEC)
>>>>>>>
>>>>>>>   static void clocksource_watchdog_work(struct work_struct *work)
>>>>>>>   {
>>>>>>> @@ -404,7 +406,9 @@ static void clocksource_watchdog(struct timer_list *unused)
>>>>>>>
>>>>>>>                  /* Check the deviation from the watchdog clocksource. */
>>>>>>>                  md = cs->uncertainty_margin + watchdog->uncertainty_margin;
>>>>>>> -               if (abs(cs_nsec - wd_nsec) > md) {
>>>>>>> +               if ((abs(cs_nsec - wd_nsec) > md) &&
>>>>>>> +                       cs_nsec < WATCHDOG_MAX_INTERVAL_NS &&
>>>>>>
>>>>>> Sorry, it's been awhile since I looked at this code, but why are you
>>>>>> bounding the clocksource delta here?
>>>>>> It seems like if the clocksource being watched was very wrong (with a
>>>>>> delta larger than the MAX_INTERVAL_NS), we'd want to throw it out.
>>>>>>
>>>>>>> +                       wd_nsec < WATCHDOG_MAX_INTERVAL_NS) {
>>>>>>
>>>>>> Bounding the watchdog interval on the check does seem reasonable.
>>>>>> Though one may want to keep track that if we are seeing too many of
>>>>>> these delayed watchdog checks we provide some feedback via dmesg.
>>>>>
>>>>> For some fast timeout timers, such as acpi-timer, checking wd_nsec should not
>>>>> make much sense, because when wacthdog is called, the timer may overflow many
>>>>> times.
>>>>
>>>> Indeed. But in that case we can't tell which way is up. This is what I
>>>> was fretting about when I said:
>>>>> So I do worry these watchdog robustness fixes are papering over a
>>>>> problem, pushing expectations closer to the edge of how far the system
>>>>> should tolerate bad behavior. Because at some point we'll fall off. :)
>>>>
>>>> If the timer is delayed long enough for the watchdog to wrap, we're
>>>> way out of tolerable behavior. There's not much we can do because we
>>>> can't even tell what happened.
>>>>
>>>> But in the case where the watchdog has not wrapped, I don't see a
>>>> major issue with trying to be a bit more robust in the face of just
>>>> slightly delayed timers.
>>>> (And yes, we can't really distinguish between slightly delayed and
>>>> watchdog-wrap-interval + slight delay, but in either case we can
>>>> probably skip disqualifying the clocksource as we know something seems
>>>> off)
>>>
>>> If we record the watchdog's start_time in clocksource_start_watchdog(), and then
>>> when we verify cycles in clocksource_watchdog(), check whether the clocksource
>>> watchdog is blocked. Due to MSB verification, if the blocked time is greater than
>>> half of the watchdog timer max_cycles, then we can safely ignore the current
>>> verification? Do you think this idea is okay?
>>
>> I can't say I totally understand the idea. Maybe could you clarify with a patch?
>>
> 
> Sorry, it looks almost as follows:
> 
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index b8a14d2..87f3b67 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -119,6 +119,7 @@
>   static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
>   static DEFINE_SPINLOCK(watchdog_lock);
>   static int watchdog_running;
> +static unsigned long watchdog_start_time;
>   static atomic_t watchdog_reset_pending;
>   
>   static inline void clocksource_watchdog_lock(unsigned long *flags)
> @@ -356,6 +357,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>   	int next_cpu, reset_pending;
>   	int64_t wd_nsec, cs_nsec;
>   	struct clocksource *cs;
> +	unsigned long max_jiffies;
>   	u32 md;
>   
>   	spin_lock(&watchdog_lock);
> @@ -402,6 +404,10 @@ static void clocksource_watchdog(struct timer_list *unused)
>   		if (atomic_read(&watchdog_reset_pending))
>   			continue;
>   
> +		max_jiffies = nsecs_to_jiffies(cs->max_idle_ns);
> +		if (time_is_before_jiffies(watchdog_start_time + max_jiffies))
> +			continue;
> +
>   		/* Check the deviation from the watchdog clocksource. */
>   		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
>   		if (abs(cs_nsec - wd_nsec) > md) {
> @@ -474,6 +480,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>   	 * pair clocksource_stop_watchdog() clocksource_start_watchdog().
>   	 */
>   	if (!timer_pending(&watchdog_timer)) {
> +		watchdog_start_time = jiffies;
>   		watchdog_timer.expires += WATCHDOG_INTERVAL;
>   		add_timer_on(&watchdog_timer, next_cpu);
>   	}
> @@ -488,6 +495,7 @@ static inline void clocksource_start_watchdog(void)
>   	timer_setup(&watchdog_timer, clocksource_watchdog, 0);
>   	watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
>   	add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask));
> +	watchdog_start_time = jiffies;
>   	watchdog_running = 1;
>   }
> 
> 
>> thanks
>> -john
>>

It looks good to me.
thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ