lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b84bf315-5910-41af-bc0e-525b791fcf6f@linux.dev>
Date: Tue, 24 Jun 2025 12:25:44 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: akpm@...ux-foundation.org, zi.li@...ux.dev, anna.schumaker@...cle.com,
 boqun.feng@...il.com, joel.granados@...nel.org, jstultz@...gle.com,
 kent.overstreet@...ux.dev, leonylgao@...cent.com,
 linux-kernel@...r.kernel.org, longman@...hat.com, mingo@...hat.com,
 mingzhe.yang@...com, peterz@...radead.org, rostedt@...dmis.org,
 senozhatsky@...omium.org, tfiga@...omium.org, will@...nel.org,
 Lance Yang <ioworker0@...il.com>
Subject: Re: [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to
 rwsems



On 2025/6/24 11:57, Masami Hiramatsu (Google) wrote:
> On Thu, 12 Jun 2025 12:19:26 +0800
> Lance Yang <ioworker0@...il.com> wrote:
> 
>> From: Lance Yang <lance.yang@...ux.dev>
>>
>> Inspired by mutex blocker tracking[1], and having already extended it to
>> semaphores, let's now add support for reader-writer semaphores (rwsems).
>>
>> The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
>> waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
>> detector can then query the rwsem's owner to identify the lock holder.
>>
>> Tracking works reliably for writers, as there can only be a single writer
>> holding the lock, and its task struct is stored in the owner field.
>>
>> The main challenge lies with readers. The owner field points to only one
>> of many concurrent readers, so we might lose track of the blocker if that
>> specific reader unlocks, even while others remain. This is not a
>> significant issue, however. In practice, long-lasting lock contention is
>> almost always caused by a writer. Therefore, reliably tracking the writer
>> is the primary goal of this patch series ;)
> 
> I think as far as it is reliable, it is better than nothing :) and that
> can help us to debug some part of kernel crashes.

Yeah! That was the goal. Even if it's not a perfect solution for all
reader scenarios, it provides much-needed visibility for writes and
is a good improvement ;)

> 
>>
>> With this change, the hung task detector can now show blocker task's info
>> like below:
>>
>> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
>> [Thu Jun 12 11:01:33 2025]       Tainted: G S         O        6.16.0-rc1 #1
>> [Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread2  state:D stack:0     pid:36526 tgid:36526 ppid:2      task_flags:0x208040 flags:0x00004000
>> [Thu Jun 12 11:01:33 2025] Call Trace:
>> [Thu Jun 12 11:01:33 2025]  <TASK>
>> [Thu Jun 12 11:01:33 2025]  __schedule+0x7c7/0x1930
>> [Thu Jun 12 11:01:33 2025]  ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
>> [Thu Jun 12 11:01:33 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  schedule+0x6a/0x180
>> [Thu Jun 12 11:01:33 2025]  schedule_preempt_disabled+0x15/0x30
>> [Thu Jun 12 11:01:33 2025]  rwsem_down_write_slowpath+0x447/0x1090
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_rwsem_down_write_slowpath+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx___might_resched+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025]  down_write+0x125/0x140
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_down_write+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? msleep+0x91/0xf0
>> [Thu Jun 12 11:01:33 2025]  ? __raw_spin_lock_irqsave+0x8c/0xf0
>> [Thu Jun 12 11:01:33 2025]  thread2_func+0x37/0x70 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025]  kthread+0x39f/0x750
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ret_from_fork+0x25d/0x320
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ret_from_fork_asm+0x1a/0x30
>> [Thu Jun 12 11:01:33 2025]  </TASK>
>> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
>> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread1  state:S stack:0     pid:36525 tgid:36525 ppid:2      task_flags:0x208040 flags:0x00004000
>> [Thu Jun 12 11:01:33 2025] Call Trace:
>> [Thu Jun 12 11:01:33 2025]  <TASK>
>> [Thu Jun 12 11:01:33 2025]  __schedule+0x7c7/0x1930
>> [Thu Jun 12 11:01:33 2025]  ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __mod_timer+0x304/0xa80
>> [Thu Jun 12 11:01:33 2025]  ? irq_work_queue+0x6a/0xa0
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_vprintk_emit+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  schedule+0x6a/0x180
>> [Thu Jun 12 11:01:33 2025]  schedule_timeout+0xfb/0x230
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_schedule_timeout+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_process_timeout+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? down_write+0xc4/0x140
>> [Thu Jun 12 11:01:33 2025]  msleep_interruptible+0xbe/0x150
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025]  thread1_func+0x37/0x60 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025]  kthread+0x39f/0x750
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ret_from_fork+0x25d/0x320
>> [Thu Jun 12 11:01:33 2025]  ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025]  ret_from_fork_asm+0x1a/0x30
>> [Thu Jun 12 11:01:33 2025]  </TASK>
>>
>> [1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
>>
>> Suggested-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
>> Signed-off-by: Lance Yang <lance.yang@...ux.dev>
> 
> Looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> 
> BTW, can you also add a patch to extend the test module
> (samples/hung_task/hung_task_tests.c) to handle the rwsem
> blocking case ? (reader and writer side)

Thanks! Will do.
Lance

> 
> Thank you,
> 
> 
>> ---
>>   include/linux/hung_task.h | 18 +++++++++---------
>>   kernel/hung_task.c        | 29 +++++++++++++++++++++++++----
>>   kernel/locking/rwsem.c    | 17 ++++++++++++++++-
>>   3 files changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
>> index 1bc2b3244613..34e615c76ca5 100644
>> --- a/include/linux/hung_task.h
>> +++ b/include/linux/hung_task.h
>> @@ -21,17 +21,17 @@
>>    * type.
>>    *
>>    * Type encoding:
>> - * 00 - Blocked on mutex        (BLOCKER_TYPE_MUTEX)
>> - * 01 - Blocked on semaphore    (BLOCKER_TYPE_SEM)
>> - * 10 - Blocked on rt-mutex     (BLOCKER_TYPE_RTMUTEX)
>> - * 11 - Blocked on rw-semaphore (BLOCKER_TYPE_RWSEM)
>> + * 00 - Blocked on mutex			(BLOCKER_TYPE_MUTEX)
>> + * 01 - Blocked on semaphore			(BLOCKER_TYPE_SEM)
>> + * 10 - Blocked on rw-semaphore as READER	(BLOCKER_TYPE_RWSEM_READER)
>> + * 11 - Blocked on rw-semaphore as WRITER	(BLOCKER_TYPE_RWSEM_WRITER)
>>    */
>> -#define BLOCKER_TYPE_MUTEX      0x00UL
>> -#define BLOCKER_TYPE_SEM        0x01UL
>> -#define BLOCKER_TYPE_RTMUTEX    0x02UL
>> -#define BLOCKER_TYPE_RWSEM      0x03UL
>> +#define BLOCKER_TYPE_MUTEX		0x00UL
>> +#define BLOCKER_TYPE_SEM		0x01UL
>> +#define BLOCKER_TYPE_RWSEM_READER	0x02UL
>> +#define BLOCKER_TYPE_RWSEM_WRITER	0x03UL
>>   
>> -#define BLOCKER_TYPE_MASK       0x03UL
>> +#define BLOCKER_TYPE_MASK		0x03UL
>>   
>>   #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>>   static inline void hung_task_set_blocker(void *lock, unsigned long type)
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index d2432df2b905..8708a1205f82 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/sched/debug.h>
>>   #include <linux/sched/sysctl.h>
>>   #include <linux/hung_task.h>
>> +#include <linux/rwsem.h>
>>   
>>   #include <trace/events/sched.h>
>>   
>> @@ -100,6 +101,7 @@ static void debug_show_blocker(struct task_struct *task)
>>   {
>>   	struct task_struct *g, *t;
>>   	unsigned long owner, blocker, blocker_type;
>> +	const char *rwsem_blocked_by, *rwsem_blocked_as;
>>   
>>   	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
>>   
>> @@ -111,12 +113,20 @@ static void debug_show_blocker(struct task_struct *task)
>>   
>>   	switch (blocker_type) {
>>   	case BLOCKER_TYPE_MUTEX:
>> -		owner = mutex_get_owner(
>> -			(struct mutex *)hung_task_blocker_to_lock(blocker));
>> +		owner = mutex_get_owner(hung_task_blocker_to_lock(blocker));
>>   		break;
>>   	case BLOCKER_TYPE_SEM:
>> -		owner = sem_last_holder(
>> -			(struct semaphore *)hung_task_blocker_to_lock(blocker));
>> +		owner = sem_last_holder(hung_task_blocker_to_lock(blocker));
>> +		break;
>> +	case BLOCKER_TYPE_RWSEM_READER:
>> +	case BLOCKER_TYPE_RWSEM_WRITER:
>> +		owner = (unsigned long)rwsem_owner(
>> +					hung_task_blocker_to_lock(blocker));
>> +		rwsem_blocked_as = (blocker_type == BLOCKER_TYPE_RWSEM_READER) ?
>> +					"reader" : "writer";
>> +		rwsem_blocked_by = is_rwsem_reader_owned(
>> +					hung_task_blocker_to_lock(blocker)) ?
>> +					"reader" : "writer";
>>   		break;
>>   	default:
>>   		WARN_ON_ONCE(1);
>> @@ -134,6 +144,11 @@ static void debug_show_blocker(struct task_struct *task)
>>   			pr_err("INFO: task %s:%d is blocked on a semaphore, but the last holder is not found.\n",
>>   			       task->comm, task->pid);
>>   			break;
>> +		case BLOCKER_TYPE_RWSEM_READER:
>> +		case BLOCKER_TYPE_RWSEM_WRITER:
>> +			pr_err("INFO: task %s:%d is blocked on an rw-semaphore, but the owner is not found.\n",
>> +			       task->comm, task->pid);
>> +			break;
>>   		}
>>   		return;
>>   	}
>> @@ -152,6 +167,12 @@ static void debug_show_blocker(struct task_struct *task)
>>   			pr_err("INFO: task %s:%d blocked on a semaphore likely last held by task %s:%d\n",
>>   			       task->comm, task->pid, t->comm, t->pid);
>>   			break;
>> +		case BLOCKER_TYPE_RWSEM_READER:
>> +		case BLOCKER_TYPE_RWSEM_WRITER:
>> +			pr_err("INFO: task %s:%d <%s> blocked on an rw-semaphore likely owned by task %s:%d <%s>\n",
>> +			       task->comm, task->pid, rwsem_blocked_as, t->comm,
>> +			       t->pid, rwsem_blocked_by);
>> +			break;
>>   		}
>>   		sched_show_task(t);
>>   		return;
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index a310eb9896de..92c6332da401 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/export.h>
>>   #include <linux/rwsem.h>
>>   #include <linux/atomic.h>
>> +#include <linux/hung_task.h>
>>   #include <trace/events/lock.h>
>>   
>>   #ifndef CONFIG_PREEMPT_RT
>> @@ -1065,10 +1066,13 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>>   		wake_up_q(&wake_q);
>>   
>>   	trace_contention_begin(sem, LCB_F_READ);
>> +	set_current_state(state);
>> +
>> +	if (state == TASK_UNINTERRUPTIBLE)
>> +		hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
>>   
>>   	/* wait to be given the lock */
>>   	for (;;) {
>> -		set_current_state(state);
>>   		if (!smp_load_acquire(&waiter.task)) {
>>   			/* Matches rwsem_mark_wake()'s smp_store_release(). */
>>   			break;
>> @@ -1083,8 +1087,12 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>>   		}
>>   		schedule_preempt_disabled();
>>   		lockevent_inc(rwsem_sleep_reader);
>> +		set_current_state(state);
>>   	}
>>   
>> +	if (state == TASK_UNINTERRUPTIBLE)
>> +		hung_task_clear_blocker();
>> +
>>   	__set_current_state(TASK_RUNNING);
>>   	lockevent_inc(rwsem_rlock);
>>   	trace_contention_end(sem, 0);
>> @@ -1146,6 +1154,9 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>>   	set_current_state(state);
>>   	trace_contention_begin(sem, LCB_F_WRITE);
>>   
>> +	if (state == TASK_UNINTERRUPTIBLE)
>> +		hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
>> +
>>   	for (;;) {
>>   		if (rwsem_try_write_lock(sem, &waiter)) {
>>   			/* rwsem_try_write_lock() implies ACQUIRE on success */
>> @@ -1179,6 +1190,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>>   trylock_again:
>>   		raw_spin_lock_irq(&sem->wait_lock);
>>   	}
>> +
>> +	if (state == TASK_UNINTERRUPTIBLE)
>> +		hung_task_clear_blocker();
>> +
>>   	__set_current_state(TASK_RUNNING);
>>   	raw_spin_unlock_irq(&sem->wait_lock);
>>   	lockevent_inc(rwsem_wlock);
>> -- 
>> 2.49.0
>>
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ