Message-ID: <CAK1f24nbxP2csK=yU=umVbKHUgjWiEVaeaJ_7Yk1NtgKu21NOg@mail.gmail.com>
Date: Fri, 14 Mar 2025 22:36:11 +0800
From: Lance Yang <ioworker0@...il.com>
To: akpm@...ux-foundation.org
Cc: will@...nel.org, peterz@...radead.org, mingo@...hat.com, 
	longman@...hat.com, mhiramat@...nel.org, anna.schumaker@...cle.com, 
	boqun.feng@...il.com, joel.granados@...nel.org, kent.overstreet@...ux.dev, 
	leonylgao@...cent.com, linux-kernel@...r.kernel.org, rostedt@...dmis.org, 
	senozhatsky@...omium.org, tfiga@...omium.org, amaindex@...look.com
Subject: Re: [PATCH v2 0/3] hung_task: extend blocking task stacktrace dump to semaphore

Oops, I got the version wrong and will resend the new one right away.

Thanks,
Lance

On Fri, Mar 14, 2025 at 10:29 PM Lance Yang <ioworker0@...il.com> wrote:
>
> Hi all,
>
> Inspired by mutex blocker tracking[1], this patch series extends the
> feature so that it dumps the blocker task not only for mutexes but also
> for semaphores. Unlike mutexes, semaphores lack explicit ownership
> tracking, making it challenging to identify the root cause of hangs. To
> address this, we introduce a last_holder field to the semaphore structure,
> which is updated when a task successfully calls down() and cleared during
> up().
>
> The assumption is that if a task is blocked on a semaphore, the holders
> must not have released it. While this does not guarantee that the last
> holder is one of the current blockers, it likely provides a practical hint
> for diagnosing semaphore-related stalls.
>
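The last_holder bookkeeping described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the code
from the series: every toy_* name is hypothetical, the model is
single-threaded with no locking or sleeping, and only last_holder, down() and
up() are taken from the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; the kernel would use struct task_struct. */
struct toy_task {
	int pid;
};

/* Toy counting semaphore carrying the extra last_holder hint. */
struct toy_sem {
	unsigned int count;
	const struct toy_task *last_holder;
};

static void toy_sem_init(struct toy_sem *sem, unsigned int count)
{
	sem->count = count;
	sem->last_holder = NULL;
}

/* Non-blocking acquire: on success, record the caller as the last holder. */
static bool toy_down_trylock(struct toy_sem *sem, const struct toy_task *self)
{
	if (sem->count == 0)
		return false;	/* a real down() would sleep here */
	sem->count--;
	sem->last_holder = self;
	return true;
}

/* Release: clear the hint, mirroring what the cover letter says about up(). */
static void toy_up(struct toy_sem *sem)
{
	sem->last_holder = NULL;
	sem->count++;
}

/* What a detector could report for a waiter: the likely blocker, or NULL. */
static const struct toy_task *toy_likely_blocker(const struct toy_sem *sem)
{
	return sem->count == 0 ? sem->last_holder : NULL;
}
```

With an initial count of 1, a second toy_down_trylock() fails while the first
caller still holds the semaphore, and toy_likely_blocker() then points at that
first caller. With a larger initial count the hint only names the most recent
acquirer, which is why the cover letter calls it a hint rather than a
guarantee.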
> With this change, the hung task detector can now show the blocker task's
> info, as shown below:
>
> [Thu Mar 13 15:18:38 2025] INFO: task cat:1803 blocked for more than 122 seconds.
> [Thu Mar 13 15:18:38 2025]       Tainted: G           OE      6.14.0-rc3+ #14
> [Thu Mar 13 15:18:38 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Thu Mar 13 15:18:38 2025] task:cat             state:D stack:0     pid:1803  tgid:1803  ppid:1057   task_flags:0x400000 flags:0x00000004
> [Thu Mar 13 15:18:38 2025] Call trace:
> [Thu Mar 13 15:18:38 2025]  __switch_to+0x1ec/0x380 (T)
> [Thu Mar 13 15:18:38 2025]  __schedule+0xc30/0x44f8
> [Thu Mar 13 15:18:38 2025]  schedule+0xb8/0x3b0
> [Thu Mar 13 15:18:38 2025]  schedule_timeout+0x1d0/0x208
> [Thu Mar 13 15:18:38 2025]  __down_common+0x2d4/0x6f8
> [Thu Mar 13 15:18:38 2025]  __down+0x24/0x50
> [Thu Mar 13 15:18:38 2025]  down+0xd0/0x140
> [Thu Mar 13 15:18:38 2025]  read_dummy+0x3c/0xa0 [hung_task_sem]
> [Thu Mar 13 15:18:38 2025]  full_proxy_read+0xfc/0x1d0
> [Thu Mar 13 15:18:38 2025]  vfs_read+0x1a0/0x858
> [Thu Mar 13 15:18:38 2025]  ksys_read+0x100/0x220
> [Thu Mar 13 15:18:38 2025]  __arm64_sys_read+0x78/0xc8
> [Thu Mar 13 15:18:38 2025]  invoke_syscall+0xd8/0x278
> [Thu Mar 13 15:18:38 2025]  el0_svc_common.constprop.0+0xb8/0x298
> [Thu Mar 13 15:18:38 2025]  do_el0_svc+0x4c/0x88
> [Thu Mar 13 15:18:38 2025]  el0_svc+0x44/0x108
> [Thu Mar 13 15:18:38 2025]  el0t_64_sync_handler+0x134/0x160
> [Thu Mar 13 15:18:38 2025]  el0t_64_sync+0x1b8/0x1c0
> [Thu Mar 13 15:18:38 2025] INFO: task cat:1803 blocked on a semaphore likely last held by task cat:1802
> [Thu Mar 13 15:18:38 2025] task:cat             state:S stack:0     pid:1802  tgid:1802  ppid:1057   task_flags:0x400000 flags:0x00000004
> [Thu Mar 13 15:18:38 2025] Call trace:
> [Thu Mar 13 15:18:38 2025]  __switch_to+0x1ec/0x380 (T)
> [Thu Mar 13 15:18:38 2025]  __schedule+0xc30/0x44f8
> [Thu Mar 13 15:18:38 2025]  schedule+0xb8/0x3b0
> [Thu Mar 13 15:18:38 2025]  schedule_timeout+0xf4/0x208
> [Thu Mar 13 15:18:38 2025]  msleep_interruptible+0x70/0x130
> [Thu Mar 13 15:18:38 2025]  read_dummy+0x48/0xa0 [hung_task_sem]
> [Thu Mar 13 15:18:38 2025]  full_proxy_read+0xfc/0x1d0
> [Thu Mar 13 15:18:38 2025]  vfs_read+0x1a0/0x858
> [Thu Mar 13 15:18:38 2025]  ksys_read+0x100/0x220
> [Thu Mar 13 15:18:38 2025]  __arm64_sys_read+0x78/0xc8
> [Thu Mar 13 15:18:38 2025]  invoke_syscall+0xd8/0x278
> [Thu Mar 13 15:18:38 2025]  el0_svc_common.constprop.0+0xb8/0x298
> [Thu Mar 13 15:18:38 2025]  do_el0_svc+0x4c/0x88
> [Thu Mar 13 15:18:38 2025]  el0_svc+0x44/0x108
> [Thu Mar 13 15:18:38 2025]  el0t_64_sync_handler+0x134/0x160
> [Thu Mar 13 15:18:38 2025]  el0t_64_sync+0x1b8/0x1c0
>
> [1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com
>
> Thanks,
> Lance
>
> ---
> v1 -> v2:
>  * Use one field to store the blocker as only one is active at a time,
>  suggested by Masami
>  * Leverage the LSB of the blocker field to reduce memory footprint,
>  suggested by Masami
>  * Add sample code for testing hung_task detection of semaphore blocking
>  * https://lore.kernel.org/all/20250301055102.88746-1-ioworker0@gmail.com
>
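The two changelog items about a single blocker field and its LSB can be
sketched in userspace as pointer tagging. This is a minimal illustration under
stated assumptions, not the helpers from include/linux/hung_task.h: all toy_*
and TOY_* names are hypothetical, and it assumes lock objects are at least
2-byte aligned so that bit 0 of their address is always zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; the actual values in the series may differ. */
enum toy_blocker_type {
	TOY_BLOCKER_MUTEX = 0,
	TOY_BLOCKER_SEM = 1,
};

/* Bit 0 of the stored word carries the lock type. */
#define TOY_BLOCKER_TYPE_MASK ((uintptr_t)1)

/* Pack a lock address and its type into one word. */
static uintptr_t toy_encode_blocker(const void *lock, enum toy_blocker_type type)
{
	uintptr_t addr = (uintptr_t)lock;

	/* The address must leave bit 0 free, or the tag would corrupt it. */
	assert((addr & TOY_BLOCKER_TYPE_MASK) == 0);
	return addr | (uintptr_t)type;
}

/* Recover the type tag from the packed word. */
static enum toy_blocker_type toy_blocker_get_type(uintptr_t blocker)
{
	return (enum toy_blocker_type)(blocker & TOY_BLOCKER_TYPE_MASK);
}

/* Recover the original lock address by masking the tag off. */
static void *toy_blocker_get_lock(uintptr_t blocker)
{
	return (void *)(blocker & ~TOY_BLOCKER_TYPE_MASK);
}
```

Because only one blocker is active per task at a time, a single word of this
shape can replace separate per-lock-type pointers, which is the memory saving
the changelog refers to.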
> Lance Yang (2):
>   hung_task: replace blocker_mutex with encoded blocker
>   hung_task: show the blocker task if the task is hung on semaphore
>
> Zi Li (1):
>   samples: add hung_task detector semaphore blocking sample
>
>  include/linux/hung_task.h               | 94 +++++++++++++++++++++++++
>  include/linux/sched.h                   |  2 +-
>  include/linux/semaphore.h               | 15 +++-
>  kernel/hung_task.c                      | 52 +++++++++++---
>  kernel/locking/mutex.c                  |  8 ++-
>  kernel/locking/semaphore.c              | 55 +++++++++++++--
>  samples/Kconfig                         | 11 +--
>  samples/hung_task/Makefile              |  3 +-
>  samples/hung_task/hung_task_mutex.c     | 20 ++++--
>  samples/hung_task/hung_task_semaphore.c | 74 +++++++++++++++++++
>  10 files changed, 301 insertions(+), 33 deletions(-)
>  create mode 100644 include/linux/hung_task.h
>  create mode 100644 samples/hung_task/hung_task_semaphore.c
>
> --
> 2.45.2
>
