linux-kernel - Re: [PATCH RFC] clocksource: Detect a watchdog overflow

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877fg9n8d4.fsf@ni.com>
Date:	Thu, 7 Apr 2016 01:14:31 -0700
From:	Gratian Crisan <gratian.crisan@...com>
To:	John Stultz <john.stultz@...aro.org>
CC:	Gratian Crisan <gratian.crisan@...com>,
	Thomas Gleixner <tglx@...utronix.de>,
	lkml <linux-kernel@...r.kernel.org>,
	Gratian Crisan <gratian@...il.com>
Subject: Re: [PATCH RFC] clocksource: Detect a watchdog overflow


John Stultz writes:

> On Tue, Mar 15, 2016 at 11:50 AM, Gratian Crisan <gratian.crisan@...com> wrote:
>> The clocksource watchdog can falsely trigger and disable the main
>> clocksource when the watchdog wraps around.
>>
>> The reason is that an interrupt storm and/or high priority (FIFO/RR) tasks
>> can preempt the timer softirq long enough for the watchdog to wrap around
>> if it has a limited number of bits available by comparison to the main
>> clocksource. One observed example is on a Intel Baytrail platform where TSC
>> is the main clocksource, HPET is disabled due to a hardware bug and acpi_pm
>> gets selected as the watchdog clocksource.
>>
>> Calculate the maximum number of nanoseconds the watchdog clocksource can
>> represent without overflow and do not disqualify the main clocksource if
>> the delta since the last time we have checked exceeds the measurement
>> capabilities of the watchdog clocksource.
>
> Sorry for not getting back to you sooner on this. You managed to send
> these both out while I was at a conference and on vacation, and so
> they were deep in the mail backlog. :)

No worries, I'm actually "on the road" this week too (ELC). I appreciate
the reply.

> So I'm sympathetic to this issue, because I remember seeing similar
> problems w/ runaway SCHED_FIFO tasks w/ PREEMPT_RT.

Yeah, a runaway rt thread can easily do it. That's just bad design. In
our case it was a bit more subtle bc. it was a combination of high
priority interrupts and rt threads that would occasionally stack up to
delay the timer softirq long enough to cause the watchdog wrap.

> However, its really difficult to create a solution without opening new
> cases where bad clocksources will be mis-identified as good (which
> your solution seems to suffer as well, measuring the time past with a
> known bad clocksource can easily result in large deltas, which will be
> ignored if the watchdog has a short interval).

Fair point. Ultimately you have to trust one of the clocksources. I
guess I was naive in thinking that the main clocksource can't drift more
than what the watchdog clocksource can measure within the
WATCHDOG_INTERVAL. I'm glad I don't have to deal with hardware that
lobotomized.

Would a simple solution that exposes the config option for the
clocksource wathchdog[1] (and defaults it to on) be an acceptable
alternative? It will work for us because we test the stability of the
main clocksource - part of the hardware bring-up.

> A previous effort on this was made here, and there's a resulting
> thread that didn't come to resolution:
>     https://lkml.org/lkml/2015/8/17/542

Sorry I've missed it.

> Way back I had tried to come up with an approach where if the time
> delta was large, it was divided by the watchdog interval, and then we
> just compared the remainder with the current watchdog delta to see if
> they matched (closely enough). Unfortunately this didn't work out for
> me then, but perhaps it deserves a second try?

I've entertained that idea too but I think I was trying to optimize
things too early and do everything with the mult/shift math. That first
attempt failed but I do need to try harder because it would be a better
general solution.

> thanks
> -john

Thanks,
-Gratian

[1]

>From e942ddaba439cd6711e9eed44ceae34167b864f8 Mon Sep 17 00:00:00 2001
From: Gratian Crisan <gratian.crisan@...com>
Date: Wed, 6 Apr 2016 21:20:15 -0700
Subject: [PATCH] time: Make the clocksource watchdog user configurable

The clocksource watchdog is used to detect instabilities in the current
clocksource. This is a beneficial feature on new/unknown hardware however
it can create problems by falsely triggering when the watchdog wraps. The
reason is that an interrupt storm and/or high priority (FIFO/RR) tasks can
preempt the timer softirq long enough for the watchdog to wrap if it has a
limited number of bits available by comparison with the main clocksource.

One observed example is on a Intel Baytrail platform where TSC is the main
clocksource, HPET is disabled due to a hardware bug and acpi_pm gets
selected as the watchdog clocksource.

Provide the option to disable the clocksource watchdog for hardware where
the clocksource stability has been validated.

Signed-off-by: Gratian Crisan <gratian.crisan@...com>
---
 arch/x86/Kconfig    |  2 +-
 kernel/time/Kconfig | 12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2dc18605..6da5d9e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,7 +54,7 @@ config X86
 	select CLKEVT_I8253
 	select CLKSRC_I8253			if X86_32
 	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
-	select CLOCKSOURCE_WATCHDOG
+	select HAVE_CLOCKSOURCE_WATCHDOG
 	select CLONE_BACKWARDS			if X86_32
 	select COMPAT_OLD_SIGACTION		if IA32_EMULATION
 	select DCACHE_WORD_ACCESS
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 4008d9f..6707f1d 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -5,7 +5,7 @@
 # Options selectable by arch Kconfig
 
 # Watchdog function for clocksources to detect instabilities
-config CLOCKSOURCE_WATCHDOG
+config HAVE_CLOCKSOURCE_WATCHDOG
 	bool
 
 # Architecture has extra clocksource data
@@ -193,5 +193,15 @@ config HIGH_RES_TIMERS
 	  hardware is not capable then this option only increases
 	  the size of the kernel image.
 
+config CLOCKSOURCE_WATCHDOG
+	bool "Clocksource watchdog"
+	depends on HAVE_CLOCKSOURCE_WATCHDOG
+	default y
+	help
+	  This option enables the watchdog function for clocksources. It is
+	  used to detect instabilities in the currently selected clocksource.
+
+	  Say Y if you are unsure.
+
 endmenu
 endif
-- 
1.9.1