lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <30f8fa10-4527-4e03-91c6-638834412466@redhat.com>
Date: Sun, 23 Feb 2025 14:53:28 -0500
From: Waiman Long <llong@...hat.com>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Will Deacon <will@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>
Cc: Boqun Feng <boqun.feng@...il.com>,
 Joel Granados <joel.granados@...nel.org>,
 Anna Schumaker <anna.schumaker@...cle.com>, Lance Yang
 <ioworker0@...il.com>, Kent Overstreet <kent.overstreet@...ux.dev>,
 Yongliang Gao <leonylgao@...cent.com>, Steven Rostedt <rostedt@...dmis.org>,
 Tomasz Figa <tfiga@...omium.org>,
 Sergey Senozhatsky <senozhatsky@...omium.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/2] hung_task: Show the blocker task if the task is
 hung on mutex

On 2/23/25 2:34 PM, Waiman Long wrote:
> On 2/21/25 9:03 PM, Masami Hiramatsu (Google) wrote:
>> From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
>>
>> The "hung_task" shows a long-time uninterruptible slept task, but most
>> often, it's blocked on a mutex acquired by another task. Without
>> dumping such a task, investigating the root cause of the hung task
>> problem is very difficult.
>>
>> This introduce task_struct::blocker_mutex to point the mutex lock
>> which this task is waiting for. Since the mutex has "owner"
>> information, we can find the owner task and dump it with hung tasks.
>>
>> Note: the owner can be changed while dumping the owner task, so
>> this is "likely" the owner of the mutex.
>>
>> With this change, the hung task shows blocker task's info like below;
>>
>>   INFO: task cat:115 blocked for more than 122 seconds.
>>         Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>>   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
>> message.
>>   task:cat             state:D stack:13432 pid:115   tgid:115 
>> ppid:106    task_flags:0x400100 flags:0x00000002
>>   Call Trace:
>>    <TASK>
>>    __schedule+0x731/0x960
>>    ? schedule_preempt_disabled+0x54/0xa0
>>    schedule+0xb7/0x140
>>    ? __mutex_lock+0x51b/0xa60
>>    ? __mutex_lock+0x51b/0xa60
>>    schedule_preempt_disabled+0x54/0xa0
>>    __mutex_lock+0x51b/0xa60
>>    read_dummy+0x23/0x70
>>    full_proxy_read+0x6a/0xc0
>>    vfs_read+0xc2/0x340
>>    ? __pfx_direct_file_splice_eof+0x10/0x10
>>    ? do_sendfile+0x1bd/0x2e0
>>    ksys_read+0x76/0xe0
>>    do_syscall_64+0xe3/0x1c0
>>    ? exc_page_fault+0xa9/0x1d0
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>   RIP: 0033:0x4840cd
>>   RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>>   RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>>   RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>>   R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>>    </TASK>
>>   INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>>   task:cat             state:S stack:13432 pid:114   tgid:114 
>> ppid:106    task_flags:0x400100 flags:0x00000002
>>   Call Trace:
>>    <TASK>
>>    __schedule+0x731/0x960
>>    ? schedule_timeout+0xa8/0x120
>>    schedule+0xb7/0x140
>>    schedule_timeout+0xa8/0x120
>>    ? __pfx_process_timeout+0x10/0x10
>>    msleep_interruptible+0x3e/0x60
>>    read_dummy+0x2d/0x70
>>    full_proxy_read+0x6a/0xc0
>>    vfs_read+0xc2/0x340
>>    ? __pfx_direct_file_splice_eof+0x10/0x10
>>    ? do_sendfile+0x1bd/0x2e0
>>    ksys_read+0x76/0xe0
>>    do_syscall_64+0xe3/0x1c0
>>    ? exc_page_fault+0xa9/0x1d0
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>   RIP: 0033:0x4840cd
>>   RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>>   RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>>   RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>>   R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>>    </TASK>
>>
>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
>> ---
>>   Changes in v3:
>>    - Add RCU_LOCKDEP_WARN() to ensure rcu_read_lock() is held.
>>    - Cleanup code to make it fail-fast and add brace to
>>      for_each_process_thread().
>>    - Change the message to "likely owned" instead of "owned".
>>   Changes in v2:
>>    - Introduce CONFIG_DETECT_HUNG_TASK_BLOCKER for this feature.
>>    - Introduce task_struct::blocker_mutex to point the mutex.
>>    - Rename debug_mutex_get_owner() to mutex_get_owner().
>>    - Remove unneeded mutex_waiter::mutex.
>> ---
>>   include/linux/mutex.h  |    2 ++
>>   include/linux/sched.h  |    4 ++++
>>   kernel/hung_task.c     |   36 ++++++++++++++++++++++++++++++++++++
>>   kernel/locking/mutex.c |   14 ++++++++++++++
>>   lib/Kconfig.debug      |   10 ++++++++++
>>   5 files changed, 66 insertions(+)
>>
>> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
>> index 2bf91b57591b..2143d05116be 100644
>> --- a/include/linux/mutex.h
>> +++ b/include/linux/mutex.h
>> @@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, 
>> mutex_lock(_T), mutex_unlock(_T))
>>   DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
>>   DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
>>   +extern unsigned long mutex_get_owner(struct mutex *lock);
>> +
>>   #endif /* __LINUX_MUTEX_H */
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 9632e3318e0d..0cebdd736d44 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1217,6 +1217,10 @@ struct task_struct {
>>       struct mutex_waiter        *blocked_on;
>>   #endif
>>   +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>> +    struct mutex            *blocker_mutex;
>> +#endif
>> +
>>   #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>>       int                non_block_count;
>>   #endif
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 04efa7a6e69b..ccd7217fcec1 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
>>       .notifier_call = hung_task_panic,
>>   };
>>   +
>> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>> +static void debug_show_blocker(struct task_struct *task)

I think the easiest way forward is to change it to "#if 
defined(CONFIG_DETECT_HUNG_TASK_BLOCKER) && !defined(CONFIG_PREEMPT_RT)" 
to skip this feature for PREEMPT_RT right now . Otherwise, you will have 
to deal with the rtmutex code.

Cheers,
Longman

>> +{
>> +    struct task_struct *g, *t;
>> +    unsigned long owner;
>> +    struct mutex *lock;
>> +
>> +    RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
>> +
>> +    lock = READ_ONCE(task->blocker_mutex);
>> +    if (!lock)
>> +        return;
>> +
>> +    owner = mutex_get_owner(lock);
>> +    if (unlikely(!owner)) {
>> +        pr_err("INFO: task %s:%d is blocked on a mutex, but the 
>> owner is not found.\n",
>> +            task->comm, task->pid);
>> +        return;
>> +    }
>> +
>> +    /* Ensure the owner information is correct. */
>> +    for_each_process_thread(g, t) {
>> +        if ((unsigned long)t == owner) {
>> +            pr_err("INFO: task %s:%d is blocked on a mutex likely 
>> owned by task %s:%d.\n",
>> +                task->comm, task->pid, t->comm, t->pid);
>> +            sched_show_task(t);
>> +            return;
>> +        }
>> +    }
>> +}
>> +#else
>> +#define debug_show_blocker(t)    do {} while (0)
>> +#endif
>> +
>>   static void check_hung_task(struct task_struct *t, unsigned long 
>> timeout)
>>   {
>>       unsigned long switch_count = t->nvcsw + t->nivcsw;
>> @@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct 
>> *t, unsigned long timeout)
>>           pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
>>               " disables this message.\n");
>>           sched_show_task(t);
>> +        debug_show_blocker(t);
>>           hung_task_show_lock = true;
>>             if (sysctl_hung_task_all_cpu_backtrace)
>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index b36f23de48f1..6a543c204a14 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned 
>> long owner)
>>       return owner & MUTEX_FLAGS;
>>   }
>>   +/* Do not use the return value as a pointer directly. */
>> +unsigned long mutex_get_owner(struct mutex *lock)
>> +{
>> +    unsigned long owner = atomic_long_read(&lock->owner);
>> +
>> +    return (unsigned long)__owner_task(owner);
>> +}
>> +
>>   /*
>>    * Returns: __mutex_owner(lock) on failure or NULL on success.
>>    */
>> @@ -180,6 +188,9 @@ static void
>>   __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
>>              struct list_head *list)
>>   {
>> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>> +    WRITE_ONCE(current->blocker_mutex, lock);
>> +#endif
>>       debug_mutex_add_waiter(lock, waiter, current);
>>         list_add_tail(&waiter->list, list);
>> @@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct 
>> mutex_waiter *waiter)
>>           __mutex_clear_flag(lock, MUTEX_FLAGS);
>>         debug_mutex_remove_waiter(lock, waiter, current);
>> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>> +    WRITE_ONCE(current->blocker_mutex, NULL);
>> +#endif
>>   }
>
> This patch looks good, but there is a PREEMPT_RT complication that you 
> need to handle as well. Most of the mutex.c is compiled out if 
> CONFIG_PREEMPT_RT is defined.
>
> You can either add the CONFIG_PREEMPT_RT check in your 
> debug_show_blocker() function or in the mutex_get_owner() declaration 
> in mutex.h.
>
> You should enable CONFIG_PREEMPT_RT in a test build to make sure that 
> nothing break.
>
> Cheers,
> Longman
>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ