Message-Id: <20250222111249.70db403097e5b5ff0e1fc34d@kernel.org>
Date: Sat, 22 Feb 2025 11:12:49 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Will Deacon <will@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
 Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>, Joel
 Granados <joel.granados@...nel.org>, Anna Schumaker
 <anna.schumaker@...cle.com>, Lance Yang <ioworker0@...il.com>, Kent
 Overstreet <kent.overstreet@...ux.dev>, Yongliang Gao
 <leonylgao@...cent.com>, Steven Rostedt <rostedt@...dmis.org>, Tomasz Figa
 <tfiga@...omium.org>, Sergey Senozhatsky <senozhatsky@...omium.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/2] hung_task: Dump the blocking task stacktrace

Sorry, I forgot to update the output example. Please ignore this. 

On Sat, 22 Feb 2025 10:59:30 +0900
"Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:

> Hi,
> 
> Here is the 3rd version of the series that dumps the mutex blocker in
> the hung_task message. The previous version is here:
> 
> https://lore.kernel.org/all/174014819072.967666.10146255401631551816.stgit@mhiramat.tok.corp.google.com/
> 
> This version adds an rcu_read_lock check, adds braces for
> for_each_process_thread(), and changes the message.
> 
> The hung_task detector is very useful for detecting lockups.
> However, since it only dumps the blocked (uninterruptible sleep)
> processes, it is not enough to identify the root cause of the
> lockup.
> 
> For example, if a process holds a mutex and sleeps waiting for an
> event in an interruptible state for a long time, the other processes
> will wait on the mutex in an uninterruptible state. In this case, the
> waiter processes are dumped, but the blocker process is not shown
> because it is sleeping in an interruptible state.
> 
> This series adds a feature to dump the blocker task that holds the
> mutex when a hung task is detected, e.g.
> 
>  INFO: task cat:113 blocked for more than 122 seconds.
>        Not tainted 6.14.0-rc3-00002-g6afe972e1b9b #152
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  task:cat             state:D stack:13432 pid:113   tgid:113   ppid:103    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_preempt_disabled+0x54/0xa0
>   schedule+0xb7/0x140
>   ? __mutex_lock+0x51d/0xa50
>   ? __mutex_lock+0x51d/0xa50
>   schedule_preempt_disabled+0x54/0xa0
>   __mutex_lock+0x51d/0xa50
>   ? current_time+0x3a/0x120
>   read_dummy+0x23/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffe632b76c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffe632b7710 RDI: 0000000000000003
>  RBP: 00007ffe632b7710 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 000000003a8b63a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>
>  INFO: task cat:113 is blocked on a mutex owned by task cat:112.
>  task:cat             state:S stack:13432 pid:112   tgid:112   ppid:103    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_timeout+0xa8/0x120
>   schedule+0xb7/0x140
>   schedule_timeout+0xa8/0x120
>   ? __pfx_process_timeout+0x10/0x10
>   msleep_interruptible+0x3e/0x60
>   read_dummy+0x2d/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffd69513748 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffd69513790 RDI: 0000000000000003
>  RBP: 00007ffd69513790 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 0000000029d8d3a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>
> 
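The waiter-to-owner lookup shown in the output above can be modeled in user space. The following is only a rough sketch and not the patch's code: `TrackedMutex`, `blocked_on`, `describe_blocker` and `run_demo` are hypothetical names, Python threads stand in for kernel tasks, and an `Event.wait()` stands in for the interruptible sleep. It assumes each waiter publishes which mutex it is about to block on, so that a detector can follow that pointer to the current owner.

```python
import threading
import time


class TrackedMutex:
    """A lock that remembers which thread currently owns it (illustrative)."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.owner = None  # thread currently holding the lock, if any

    def acquire(self, blocked_on):
        me = threading.current_thread()
        blocked_on[me.name] = self   # publish "I am waiting on this mutex"
        self._lock.acquire()         # may block for a long time
        del blocked_on[me.name]      # no longer waiting
        self.owner = me

    def release(self):
        self.owner = None
        self._lock.release()


def describe_blocker(waiter_name, blocked_on):
    """Return a report line for a stuck waiter, or None if no owner is known."""
    mutex = blocked_on.get(waiter_name)
    if mutex is None:
        return None
    owner = mutex.owner
    if owner is None:
        return None
    return (f"INFO: task {waiter_name} is blocked on a mutex "
            f"owned by task {owner.name}.")


def run_demo():
    blocked_on = {}              # waiter name -> mutex it is waiting on
    mutex = TrackedMutex("dummy")
    held = threading.Event()     # set once the holder owns the mutex
    wake = threading.Event()     # ends the holder's long sleep

    def holder():
        mutex.acquire(blocked_on)
        held.set()
        wake.wait()              # stands in for the interruptible sleep
        mutex.release()

    def waiter():
        mutex.acquire(blocked_on)
        mutex.release()

    t_holder = threading.Thread(target=holder, name="cat:112")
    t_waiter = threading.Thread(target=waiter, name="cat:113")
    t_holder.start()
    held.wait()
    t_waiter.start()
    while "cat:113" not in blocked_on:   # wait until the waiter is registered
        time.sleep(0.001)
    message = describe_blocker("cat:113", blocked_on)
    wake.set()                   # let both threads finish
    t_holder.join()
    t_waiter.join()
    return message


if __name__ == "__main__":
    print(run_demo())
```

In this model the detector only reads the waiter's published mutex and that mutex's owner; it never takes the lock itself, so it cannot get stuck behind the blocker it is trying to report.
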
> TBD:
> We can extend this feature to cover other locks like rwsem and rt_mutex,
> but rwsem requires dumping all the tasks that have acquired or are
> waiting on that rwsem. We can follow the waiter link, but the output
> will be a bit different from the mutex case.
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (Google) (2):
>       hung_task: Show the blocker task if the task is hung on mutex
>       samples: Add hung_task detector mutex blocking sample
> 
> 
>  include/linux/mutex.h               |    2 +
>  include/linux/sched.h               |    4 ++
>  kernel/hung_task.c                  |   36 +++++++++++++++++++
>  kernel/locking/mutex.c              |   14 +++++++
>  lib/Kconfig.debug                   |   10 +++++
>  samples/Kconfig                     |    9 +++++
>  samples/Makefile                    |    1 +
>  samples/hung_task/Makefile          |    2 +
>  samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
>  9 files changed, 144 insertions(+)
>  create mode 100644 samples/hung_task/Makefile
>  create mode 100644 samples/hung_task/hung_task_mutex.c
> 
> --
> Masami Hiramatsu (Google) <mhiramat@...nel.org>


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>
