Message-Id: <20181115170701.4696-1-mzhivich@akamai.com>
Date:   Thu, 15 Nov 2018 12:07:01 -0500
From:   Michael Zhivich <mzhivich@...mai.com>
To:     linux-kernel@...r.kernel.org
Cc:     tiny.windzz@...il.com, joel@...lfernandes.org,
        alexander.levin@...izon.com, frederic@...nel.org,
        bigeasy@...utronix.de, mingo@...nel.org, rostedt@...dmis.org,
        paulmck@...ux.vnet.ibm.com, tglx@...utronix.de,
        john.stultz@...aro.org, arnd@...db.de, omosnace@...hat.com,
        jason.wessel@...driver.com, kreview@...mai.com,
        Michael Zhivich <mzhivich@...mai.com>
Subject: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
ksoftirqd runs, thus preventing problems with reading clocksources that
wrap often (e.g. acpi_pm).

If acpi_pm is used as the clocksource watchdog and the machine is under heavy
load, the interval between watchdog checks may be significantly longer
than the requested 0.5 seconds.  If the watchdog check is delayed by 2
seconds (observed behavior), then the acpi_pm time delta spans 2.5 seconds:

    2.5 sec * 3579545 ticks/sec ~= 8948863 = 0x888c7f

which will be treated as negative (since acpi_pm is only 24 bits wide and
bit 23 of the delta is set) and truncated to 0.  This will cause tsc to be
incorrectly declared unstable in clocksource_watchdog(), as it no longer
agrees with acpi_pm.  If the clocksource watchdog check is delayed by more
than 4.7 sec, then the acpi_pm counter will wrap around completely (2^24
ticks) and produce an incorrect time delta.
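
The arithmetic above can be checked with a small user-space model. This is
only a sketch: model_delta() and acpi_pm_ticks() are hypothetical helpers
written for illustration, loosely following the masked-delta check that the
kernel is understood to apply when last-cycle validation is enabled. They are
not kernel code.

```c
#include <stdint.h>

#define ACPI_PM_FREQ_HZ 3579545ULL          /* acpi_pm tick rate quoted above */
#define ACPI_PM_MASK    ((1ULL << 24) - 1)  /* counter is 24 bits wide */

/* Ticks accumulated over an interval given in milliseconds (truncating). */
static uint64_t acpi_pm_ticks(uint64_t interval_ms)
{
	return ACPI_PM_FREQ_HZ * interval_ms / 1000;
}

/*
 * Hypothetical model of a wrap-aware delta: mask the difference to the
 * counter width, then treat a result with the top bit set as "negative"
 * and clamp it to 0.
 */
static uint64_t model_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	uint64_t delta = (now - last) & mask;

	return (delta & ~(mask >> 1)) ? 0 : delta;
}
```

With these helpers, a 500 ms interval yields the expected tick count, a
2500 ms interval yields 0 (top bit set), and a 5000 ms interval wraps and
yields a small positive delta that corresponds to roughly 0.3 sec.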

The likely cause of this delay is that timer softirqs are deferred to
ksoftirqd when the machine is very busy.

Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
synchronous"):
   ...
   We should probably also consider the timer softirqs to be synchronous
   and not be delayed to ksoftirqd (since they were the issue with the
   earlier watchdog problems), but that should be done as a separate patch.
   ...
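
For illustration, the effect of adding TIMER_SOFTIRQ to the mask can be
modeled in user space. This is a sketch under assumptions:
model_defer_to_ksoftirqd() is a hypothetical stand-in for the decision made
by ksoftirqd_running(), the MASK_BEFORE/MASK_AFTER names are invented here,
and the enum copies only the leading softirq numbers as they are understood
to be ordered in include/linux/interrupt.h.

```c
#include <stdbool.h>

/* Softirq numbers (assumed ordering; only the entries needed here). */
enum {
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	IRQ_POLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
};

/* SOFTIRQ_NOW_MASK before and after this patch (names are illustrative). */
#define MASK_BEFORE ((1UL << HI_SOFTIRQ) | (1UL << TASKLET_SOFTIRQ))
#define MASK_AFTER  (MASK_BEFORE | (1UL << TIMER_SOFTIRQ))

/*
 * Returns true when pending softirqs would be left for ksoftirqd.
 * If any pending softirq is in now_mask, it is handled immediately,
 * regardless of whether ksoftirqd is already running.
 */
static bool model_defer_to_ksoftirqd(unsigned long pending,
				     unsigned long now_mask,
				     bool ksoftirqd_is_running)
{
	if (pending & now_mask)
		return false;
	return ksoftirqd_is_running;
}
```

In this model, a pending TIMER_SOFTIRQ is deferred with MASK_BEFORE while
ksoftirqd is running, but handled immediately with MASK_AFTER; a pending
NET_RX_SOFTIRQ alone is still deferred in both cases.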

Signed-off-by: Michael Zhivich <mzhivich@...mai.com>
---
 kernel/softirq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index d28813306b2c..6d517ce0fba8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
  * right now. Let ksoftirqd handle this at its own rate, to get fairness,
  * unless we're doing some of the synchronous softirqs.
  */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+#define SOFTIRQ_NOW_MASK \
+	((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
 static bool ksoftirqd_running(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
-- 
2.19.1
