[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAK1f24=erjdUzFvCYUFr3L0Yq+bGC8iRmFMstAWd9QSsG98vWg@mail.gmail.com>
Date: Sun, 23 Feb 2025 10:58:17 +0800
From: Lance Yang <ioworker0@...il.com>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>,
Joel Granados <joel.granados@...nel.org>, Anna Schumaker <anna.schumaker@...cle.com>,
Kent Overstreet <kent.overstreet@...ux.dev>, Yongliang Gao <leonylgao@...cent.com>,
Steven Rostedt <rostedt@...dmis.org>, Tomasz Figa <tfiga@...omium.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/2] hung_task: Show the blocker task if the task is
hung on mutex
On Sat, Feb 22, 2025 at 10:03 AM Masami Hiramatsu (Google)
<mhiramat@...nel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
>
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
>
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
>
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.
>
> With this change, the hung task shows blocker task's info like below;
>
> INFO: task cat:115 blocked for more than 122 seconds.
> Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x731/0x960
> ? schedule_preempt_disabled+0x54/0xa0
> schedule+0xb7/0x140
> ? __mutex_lock+0x51b/0xa60
> ? __mutex_lock+0x51b/0xa60
> schedule_preempt_disabled+0x54/0xa0
> __mutex_lock+0x51b/0xa60
> read_dummy+0x23/0x70
> full_proxy_read+0x6a/0xc0
> vfs_read+0xc2/0x340
> ? __pfx_direct_file_splice_eof+0x10/0x10
> ? do_sendfile+0x1bd/0x2e0
> ksys_read+0x76/0xe0
> do_syscall_64+0xe3/0x1c0
> ? exc_page_fault+0xa9/0x1d0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x4840cd
> RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
> RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
> </TASK>
> INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
> task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x731/0x960
> ? schedule_timeout+0xa8/0x120
> schedule+0xb7/0x140
> schedule_timeout+0xa8/0x120
> ? __pfx_process_timeout+0x10/0x10
> msleep_interruptible+0x3e/0x60
> read_dummy+0x2d/0x70
> full_proxy_read+0x6a/0xc0
> vfs_read+0xc2/0x340
> ? __pfx_direct_file_splice_eof+0x10/0x10
> ? do_sendfile+0x1bd/0x2e0
> ksys_read+0x76/0xe0
> do_syscall_64+0xe3/0x1c0
> ? exc_page_fault+0xa9/0x1d0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x4840cd
> RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
> RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
> </TASK>
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
Nice. I just tried it out, and it works as expected. The overhead is
almost nothing based on my micro-benchmark ;)
Tested-by: Lance Yang <ioworker0@...il.com>
Thanks,
Lance
> ---
> Changes in v3:
> - Add RCU_LOCKDEP_WARN() to ensure rcu_read_lock() is held.
> - Cleanup code to make it fail-fast and add brace to
> for_each_process_thread().
> - Change the message to "likely owned" instead of "owned".
> Changes in v2:
> - Introduce CONFIG_DETECT_HUNG_TASK_BLOCKER for this feature.
> - Introduce task_struct::blocker_mutex to point the mutex.
> - Rename debug_mutex_get_owner() to mutex_get_owner().
> - Remove unneeded mutex_waiter::mutex.
> ---
> include/linux/mutex.h | 2 ++
> include/linux/sched.h | 4 ++++
> kernel/hung_task.c | 36 ++++++++++++++++++++++++++++++++++++
> kernel/locking/mutex.c | 14 ++++++++++++++
> lib/Kconfig.debug | 10 ++++++++++
> 5 files changed, 66 insertions(+)
>
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 2bf91b57591b..2143d05116be 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
> DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
> DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
>
> +extern unsigned long mutex_get_owner(struct mutex *lock);
> +
> #endif /* __LINUX_MUTEX_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 9632e3318e0d..0cebdd736d44 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1217,6 +1217,10 @@ struct task_struct {
> struct mutex_waiter *blocked_on;
> #endif
>
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> + struct mutex *blocker_mutex;
> +#endif
> +
> #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> int non_block_count;
> #endif
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 04efa7a6e69b..ccd7217fcec1 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
> .notifier_call = hung_task_panic,
> };
>
> +
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +static void debug_show_blocker(struct task_struct *task)
> +{
> + struct task_struct *g, *t;
> + unsigned long owner;
> + struct mutex *lock;
> +
> + RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
> +
> + lock = READ_ONCE(task->blocker_mutex);
> + if (!lock)
> + return;
> +
> + owner = mutex_get_owner(lock);
> + if (unlikely(!owner)) {
> + pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
> + task->comm, task->pid);
> + return;
> + }
> +
> + /* Ensure the owner information is correct. */
> + for_each_process_thread(g, t) {
> + if ((unsigned long)t == owner) {
> + pr_err("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
> + task->comm, task->pid, t->comm, t->pid);
> + sched_show_task(t);
> + return;
> + }
> + }
> +}
> +#else
> +#define debug_show_blocker(t) do {} while (0)
> +#endif
> +
> static void check_hung_task(struct task_struct *t, unsigned long timeout)
> {
> unsigned long switch_count = t->nvcsw + t->nivcsw;
> @@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> " disables this message.\n");
> sched_show_task(t);
> + debug_show_blocker(t);
> hung_task_show_lock = true;
>
> if (sysctl_hung_task_all_cpu_backtrace)
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index b36f23de48f1..6a543c204a14 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
> return owner & MUTEX_FLAGS;
> }
>
> +/* Do not use the return value as a pointer directly. */
> +unsigned long mutex_get_owner(struct mutex *lock)
> +{
> + unsigned long owner = atomic_long_read(&lock->owner);
> +
> + return (unsigned long)__owner_task(owner);
> +}
> +
> /*
> * Returns: __mutex_owner(lock) on failure or NULL on success.
> */
> @@ -180,6 +188,9 @@ static void
> __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
> struct list_head *list)
> {
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> + WRITE_ONCE(current->blocker_mutex, lock);
> +#endif
> debug_mutex_add_waiter(lock, waiter, current);
>
> list_add_tail(&waiter->list, list);
> @@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
> __mutex_clear_flag(lock, MUTEX_FLAGS);
>
> debug_mutex_remove_waiter(lock, waiter, current);
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> + WRITE_ONCE(current->blocker_mutex, NULL);
> +#endif
> }
>
> /*
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 1af972a92d06..91dfd8c10fbb 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1260,6 +1260,16 @@ config BOOTPARAM_HUNG_TASK_PANIC
>
> Say N if unsure.
>
> +config DETECT_HUNG_TASK_BLOCKER
> + bool "Dump Hung Tasks Blocker"
> + depends on DETECT_HUNG_TASK
> + default y
> + help
> + Say Y here to show the blocker task's stacktrace who acquires
> + the mutex lock which "hung tasks" are waiting.
> + This will add overhead a bit but shows suspicious tasks and
> + call trace if it comes from waiting a mutex.
> +
> config WQ_WATCHDOG
> bool "Detect Workqueue Stalls"
> depends on DEBUG_KERNEL
>
Powered by blists - more mailing lists