Message-Id: <20250222111249.70db403097e5b5ff0e1fc34d@kernel.org>
Date: Sat, 22 Feb 2025 11:12:49 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>, Joel
Granados <joel.granados@...nel.org>, Anna Schumaker
<anna.schumaker@...cle.com>, Lance Yang <ioworker0@...il.com>, Kent
Overstreet <kent.overstreet@...ux.dev>, Yongliang Gao
<leonylgao@...cent.com>, Steven Rostedt <rostedt@...dmis.org>, Tomasz Figa
<tfiga@...omium.org>, Sergey Senozhatsky <senozhatsky@...omium.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/2] hung_task: Dump the blocking task stacktrace
Sorry, I forgot to update the output example. Please ignore this.
On Sat, 22 Feb 2025 10:59:30 +0900
"Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:
> Hi,
>
> Here is the 3rd version of the series that dumps the mutex blocker in
> the hung_task message. The previous version is here:
>
> https://lore.kernel.org/all/174014819072.967666.10146255401631551816.stgit@mhiramat.tok.corp.google.com/
>
> This version adds an rcu_read_lock check, adds braces for
> for_each_process_thread(), and changes the message.
>
> The hung_task detector is very useful for detecting lockups.
> However, since it only dumps the blocked (uninterruptible sleep)
> processes, it is not enough to identify the root cause of a
> lockup.
>
> For example, if a process holds a mutex and sleeps on an event in
> interruptible state for a long time, the other processes will wait
> on the mutex in uninterruptible state. In this case, the waiter
> processes are dumped, but the blocker process is not shown
> because it is sleeping in interruptible state.
>
> This adds a feature to dump the blocker task which holds a mutex
> when detecting a hung task. e.g.
>
> INFO: task cat:113 blocked for more than 122 seconds.
> Not tainted 6.14.0-rc3-00002-g6afe972e1b9b #152
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:cat state:D stack:13432 pid:113 tgid:113 ppid:103 task_flags:0x400100 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x731/0x960
> ? schedule_preempt_disabled+0x54/0xa0
> schedule+0xb7/0x140
> ? __mutex_lock+0x51d/0xa50
> ? __mutex_lock+0x51d/0xa50
> schedule_preempt_disabled+0x54/0xa0
> __mutex_lock+0x51d/0xa50
> ? current_time+0x3a/0x120
> read_dummy+0x23/0x70
> full_proxy_read+0x6a/0xc0
> vfs_read+0xc2/0x340
> ? __pfx_direct_file_splice_eof+0x10/0x10
> ? do_sendfile+0x1bd/0x2e0
> ksys_read+0x76/0xe0
> do_syscall_64+0xe3/0x1c0
> ? exc_page_fault+0xa9/0x1d0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x4840cd
> RSP: 002b:00007ffe632b76c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> RDX: 0000000000001000 RSI: 00007ffe632b7710 RDI: 0000000000000003
> RBP: 00007ffe632b7710 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> R13: 000000003a8b63a0 R14: 0000000000000001 R15: ffffffffffffffff
> </TASK>
> INFO: task cat:113 is blocked on a mutex owned by task cat:112.
> task:cat state:S stack:13432 pid:112 tgid:112 ppid:103 task_flags:0x400100 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x731/0x960
> ? schedule_timeout+0xa8/0x120
> schedule+0xb7/0x140
> schedule_timeout+0xa8/0x120
> ? __pfx_process_timeout+0x10/0x10
> msleep_interruptible+0x3e/0x60
> read_dummy+0x2d/0x70
> full_proxy_read+0x6a/0xc0
> vfs_read+0xc2/0x340
> ? __pfx_direct_file_splice_eof+0x10/0x10
> ? do_sendfile+0x1bd/0x2e0
> ksys_read+0x76/0xe0
> do_syscall_64+0xe3/0x1c0
> ? exc_page_fault+0xa9/0x1d0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x4840cd
> RSP: 002b:00007ffd69513748 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> RDX: 0000000000001000 RSI: 00007ffd69513790 RDI: 0000000000000003
> RBP: 00007ffd69513790 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> R13: 0000000029d8d3a0 R14: 0000000000000001 R15: ffffffffffffffff
> </TASK>
>
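The scenario and the blocker lookup described above can be sketched in plain userspace C. This is only a model under stated assumptions, not the code from the patch: the types and functions (model_task, model_mutex, model_mutex_lock, find_blocker, report_blocker) are hypothetical names, and it assumes a waiter records which mutex it is blocked on so that a detector can read that mutex's owner. The message format is copied from the example output above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel's task_struct or mutex. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE, MODEL_UNINTERRUPTIBLE };

struct model_task;

struct model_mutex {
    struct model_task *owner;        /* NULL when the mutex is free */
};

struct model_task {
    const char *comm;                /* command name, e.g. "cat" */
    int pid;
    enum model_state state;
    unsigned long blocked_secs;      /* time spent blocked so far */
    struct model_mutex *blocked_on;  /* mutex being waited for, or NULL */
};

/* Take the mutex if it is free (returns 1). Otherwise record the mutex
 * the task is blocked on and mark it uninterruptible (returns 0). */
int model_mutex_lock(struct model_mutex *m, struct model_task *t)
{
    if (m->owner == NULL) {
        m->owner = t;
        t->blocked_on = NULL;
        return 1;
    }
    t->blocked_on = m;
    t->state = MODEL_UNINTERRUPTIBLE;
    return 0;
}

/* If the task is hung (uninterruptible for longer than timeout_secs) on a
 * mutex, return that mutex's owner; otherwise return NULL. */
struct model_task *find_blocker(const struct model_task *t,
                                unsigned long timeout_secs)
{
    if (t->state != MODEL_UNINTERRUPTIBLE || t->blocked_secs <= timeout_secs)
        return NULL;
    if (t->blocked_on == NULL)
        return NULL;
    return t->blocked_on->owner;
}

/* Format the "blocked on a mutex owned by" line into buf.
 * Returns 1 if a blocker was found, 0 otherwise (buf is left empty). */
int report_blocker(const struct model_task *t, unsigned long timeout_secs,
                   char *buf, size_t len)
{
    struct model_task *owner = find_blocker(t, timeout_secs);

    if (len > 0)
        buf[0] = '\0';
    if (owner == NULL)
        return 0;
    snprintf(buf, len,
             "INFO: task %s:%d is blocked on a mutex owned by task %s:%d.",
             t->comm, t->pid, owner->comm, owner->pid);
    return 1;
}
```

In this model, the holder (pid 112) takes the mutex and then goes to MODEL_INTERRUPTIBLE, so a detector that only scans for uninterruptible tasks never reports it. The waiter (pid 113) fails model_mutex_lock(), becomes MODEL_UNINTERRUPTIBLE, and once its blocked time exceeds the timeout, report_blocker() names pid 112 as the owner.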
> TBD:
> We can extend this feature to cover other locks like rwsem and rt_mutex,
> but rwsem requires dumping all the tasks which have acquired or are
> waiting on that rwsem. We can follow the waiter link, but the output
> will be a bit different compared with the mutex case.
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (Google) (2):
>       hung_task: Show the blocker task if the task is hung on mutex
>       samples: Add hung_task detector mutex blocking sample
>
>
>  include/linux/mutex.h               |  2 +
>  include/linux/sched.h               |  4 ++
>  kernel/hung_task.c                  | 36 +++++++++++++++++++
>  kernel/locking/mutex.c              | 14 +++++++
>  lib/Kconfig.debug                   | 10 +++++
>  samples/Kconfig                     |  9 +++++
>  samples/Makefile                    |  1 +
>  samples/hung_task/Makefile          |  2 +
>  samples/hung_task/hung_task_mutex.c | 66 +++++++++++++++++++++++++++++++++++
>  9 files changed, 144 insertions(+)
>  create mode 100644 samples/hung_task/Makefile
>  create mode 100644 samples/hung_task/hung_task_mutex.c
>
> --
> Masami Hiramatsu (Google) <mhiramat@...nel.org>
--
Masami Hiramatsu (Google) <mhiramat@...nel.org>