lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210923153339.437521634@linutronix.de>
Date:   Thu, 23 Sep 2021 18:04:20 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     LKML <linux-kernel@...r.kernel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Dmitry Vyukov <dvyukov@...gle.com>
Subject: [patch 01/11] hrtimer: Add a mechanism to catch runaway timers

A recent report from syzbot unearthed a problem with self rearming timers
which fire late and try to catch up to now with a short period. That causes
the timer to be rearmed in the past until it eventually catches up with
now. If that rearming happens from the timer callback the hard or soft
interrupt expiry loop can run for a long time with either interrupts or
bottom halves disabled which causes RCU stalls and other lockups.

There is no safety net to stop or at least identify such runaway timers.

Detection is trivial. Cache the pointer to the last expired timer. The next
invocation from the same loop compares the pointer with the next expiring
hrtimer pointer and if they match 10 times in a row (in the same hard or
soft interrupt expiry instance) then it's reasonable to declare it as a
runaway.

In that case emit a warning and skip the callback invocation which stops
the misbehaving timer right there.

It's obviously incomplete, but it's definitely better than nothing and would
have caught the reported issue in mac80211_hwsim.

Suggested-by: Dmitry Vyukov <dvyukov@...gle.com>
Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
---
 kernel/time/hrtimer.c |   25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1714,6 +1714,13 @@ static void __run_hrtimer(struct hrtimer
 	base->running = NULL;
 }
 
+static void hrtimer_del_runaway(struct hrtimer_clock_base *base,
+				struct hrtimer *timer)
+{
+	__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
+	pr_warn("Runaway hrtimer %p %ps stopped\n", timer, timer->function);
+}
+
 static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now,
 				 unsigned long flags, unsigned int active_mask)
 {
@@ -1722,6 +1729,8 @@ static void __hrtimer_run_queues(struct
 
 	for_each_active_base(base, cpu_base, active) {
 		struct timerqueue_node *node;
+		struct hrtimer *last = NULL;
+		unsigned int cnt = 0;
 		ktime_t basenow;
 
 		basenow = ktime_add(now, base->offset);
@@ -1732,6 +1741,22 @@ static void __hrtimer_run_queues(struct
 			timer = container_of(node, struct hrtimer, node);
 
 			/*
+			 * Catch timers which rearm themself with a expiry
+			 * time in the past over and over which makes this
+			 * loop run forever.
+			 */
+			if (IS_ENABLED(CONFIG_DEBUG_OBJECTS_TIMERS)) {
+				if (unlikely(last == timer)) {
+					if (++cnt == 10) {
+						hrtimer_del_runaway(base, timer);
+						continue;
+					}
+				}
+				last = timer;
+				cnt = 0;
+			}
+
+			/*
 			 * The immediate goal for using the softexpires is
 			 * minimizing wakeups, not running timers at the
 			 * earliest interrupt after their soft expiration.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ