Message-Id: <1616140838-24222-1-git-send-email-wangqing@vivo.com>
Date: Fri, 19 Mar 2021 16:00:36 +0800
From: Wang Qing <wangqing@...o.com>
To: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Guilherme G. Piccoli" <gpiccoli@...onical.com>,
Andrey Ignatov <rdna@...com>, Vlastimil Babka <vbabka@...e.cz>,
Petr Mladek <pmladek@...e.com>, Wang Qing <wangqing@...o.com>,
Santosh Sivaraj <santosh@...six.org>,
linux-kernel@...r.kernel.org
Subject: [PATCH V2] workqueue: watchdog: update wq_watchdog_touched for unbound lockup checking
When touch_softlockup_watchdog() is called, only wq_watchdog_touched_cpu
is updated, but an unbound worker_pool running on that CPU uses
wq_watchdog_touched to determine whether it has locked up. The unbound
pool therefore never sees the touch, and its lockup check may give the
wrong result.

Update both timestamps when touch_softlockup_watchdog() is called, use
wq_watchdog_touched_cpu to check bound worker_pools, and use
wq_watchdog_touched to check unbound worker_pools.

Signed-off-by: Wang Qing <wangqing@...o.com>
---
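Note (not part of the patch): the intended behaviour after this change
can be modelled with a small userspace sketch. Everything prefixed
sketch_ or SKETCH_ is hypothetical and exists only for illustration;
plain variables stand in for jiffies and the per-CPU data, and
READ_ONCE() is omitted. It is a rough model of the timestamp selection,
not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
#define sketch_time_after(a, b) ((long)((b) - (a)) < 0)

unsigned long sketch_jiffies;				/* stand-in for jiffies */
unsigned long sketch_touched;				/* models wq_watchdog_touched */
unsigned long sketch_touched_cpu[SKETCH_NR_CPUS];	/* models wq_watchdog_touched_cpu */

/* After the patch: a per-CPU touch also refreshes the global timestamp. */
void sketch_wq_watchdog_touch(int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	if (cpu >= 0)
		sketch_touched_cpu[cpu] = sketch_jiffies;

	sketch_touched = sketch_jiffies;
}

/*
 * Bound pools (pool_cpu >= 0) compare against their CPU's timestamp,
 * unbound pools (pool_cpu < 0) against the global one. The later of
 * that value and the pool's own timestamp is checked against thresh.
 */
bool sketch_pool_stalled(int pool_cpu, unsigned long pool_ts,
			 unsigned long thresh)
{
	unsigned long touched, ts;

	if (pool_cpu >= 0)
		touched = sketch_touched_cpu[pool_cpu];
	else
		touched = sketch_touched;

	ts = sketch_time_after(pool_ts, touched) ? pool_ts : touched;

	return sketch_time_after(sketch_jiffies, ts + thresh);
}
```

In this model, touching CPU 1 keeps both the CPU 1 bound pool and any
unbound pool from being reported as stalled, while a bound pool on an
untouched CPU is still checked against its own timestamp.
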
kernel/watchdog.c | 5 +++--
kernel/workqueue.c | 17 ++++++-----------
2 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7110906..107bc38
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -278,9 +278,10 @@ void touch_all_softlockup_watchdogs(void)
 	 * update as well, the only side effect might be a cycle delay for
 	 * the softlockup check.
 	 */
-	for_each_cpu(cpu, &watchdog_allowed_mask)
+	for_each_cpu(cpu, &watchdog_allowed_mask) {
 		per_cpu(watchdog_touch_ts, cpu) = SOFTLOCKUP_RESET;
-	wq_watchdog_touch(-1);
+		wq_watchdog_touch(cpu);
+	}
 }
 
 void touch_softlockup_watchdog_sync(void)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0d150da..be08295
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5787,22 +5787,17 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 			continue;
 
 		/* get the latest of pool and touched timestamps */
+		if (pool->cpu >= 0)
+			touched = READ_ONCE(per_cpu(wq_watchdog_touched_cpu, pool->cpu));
+		else
+			touched = READ_ONCE(wq_watchdog_touched);
 		pool_ts = READ_ONCE(pool->watchdog_ts);
-		touched = READ_ONCE(wq_watchdog_touched);
 
 		if (time_after(pool_ts, touched))
 			ts = pool_ts;
 		else
 			ts = touched;
 
-		if (pool->cpu >= 0) {
-			unsigned long cpu_touched =
-				READ_ONCE(per_cpu(wq_watchdog_touched_cpu,
-						  pool->cpu));
-			if (time_after(cpu_touched, ts))
-				ts = cpu_touched;
-		}
-
 		/* did we stall? */
 		if (time_after(jiffies, ts + thresh)) {
 			lockup_detected = true;
@@ -5826,8 +5821,8 @@ notrace void wq_watchdog_touch(int cpu)
 {
 	if (cpu >= 0)
 		per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
-	else
-		wq_watchdog_touched = jiffies;
+
+	wq_watchdog_touched = jiffies;
 }
 
 static void wq_watchdog_set_thresh(unsigned long thresh)
--
2.7.4