lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 18 Oct 2021 09:14:33 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     brookxu <brookxu.cn@...il.com>
Cc:     yanghui <yanghui.def@...edance.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Clocksource: Avoid misjudgment of clocksource

On Tue, Oct 12, 2021 at 1:06 AM brookxu <brookxu.cn@...il.com> wrote:
> John Stultz wrote on 2021/10/12 13:29:
> > On Mon, Oct 11, 2021 at 10:23 PM brookxu <brookxu.cn@...il.com> wrote:
> >> John Stultz wrote on 2021/10/12 12:52 下午:
> >>> On Sat, Oct 9, 2021 at 7:04 AM brookxu <brookxu.cn@...il.com> wrote:
> >> If we record the watchdog's start_time in clocksource_start_watchdog(), and then
> >> when we verify cycles in clocksource_watchdog(), check whether the clocksource
> >> watchdog is blocked. Due to MSB verification, if the blocked time is greater than
> >> half of the watchdog timer max_cycles, then we can safely ignore the current
> >> verification? Do you think this idea is okay?
> >
> > I can't say I totally understand the idea. Maybe could you clarify with a patch?
> >
>
> Sorry, it looks almost as follows:
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index b8a14d2..87f3b67 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -119,6 +119,7 @@
>  static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
>  static DEFINE_SPINLOCK(watchdog_lock);
>  static int watchdog_running;
> +static unsigned long watchdog_start_time;
>  static atomic_t watchdog_reset_pending;
>
>  static inline void clocksource_watchdog_lock(unsigned long *flags)
> @@ -356,6 +357,7 @@ static void clocksource_watchdog(struct timer_list *unused)
>         int next_cpu, reset_pending;
>         int64_t wd_nsec, cs_nsec;
>         struct clocksource *cs;
> +       unsigned long max_jiffies;
>         u32 md;
>
>         spin_lock(&watchdog_lock);
> @@ -402,6 +404,10 @@ static void clocksource_watchdog(struct timer_list *unused)
>                 if (atomic_read(&watchdog_reset_pending))
>                         continue;
>
> +               max_jiffies = nsecs_to_jiffies(cs->max_idle_ns);
> +               if (time_is_before_jiffies(watchdog_start_time + max_jiffies))
> +                       continue;
> +

Sorry, what is the benefit of using jiffies here?   Jiffies are
updated by counting the number of tick intervals on the current
clocksource.

This seems like circular logic, where we're trying to judge the
current clocksource by using something we derived from the current
clocksource.
That's why the watchdog clocksource is important, as it's supposed to
be a separate counter that is more reliable (but likely slower) then
the preferred clocksource.

So I'm not really sure how this helps.

The earlier patch by yanghui at least used the watchdog interval to
decide if the watchdog timer had expired late. Which seemed
reasonable, but I thought it might be helpful to add some sort of a
counter so if the case is happening repeatedly (timers constantly
being delayed) we have a better signal that the watchdog and current
clocksource are out of sync.  Because again, timers are fired based on
the current clocksource. So constant delays likely mean things are
wrong.

thanks
-john

thanks
-john

Powered by blists - more mailing lists