Message-ID: <5eec3a75-43e5-4bfd-a303-a89f5d121be6@linux.dev>
Date: Wed, 28 Jan 2026 16:26:21 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
Aaron Tomlin <atomlin@...mlin.com>
Cc: akpm@...ux-foundation.org, gregkh@...uxfoundation.org, pmladek@...e.com,
joel.granados@...nel.org, neelx@...e.com, sean@...e.io, mproche@...il.com,
chjohnst@...il.com, nick.lange@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hung_task: Differentiate between I/O and Lock/Resource waits
On 2026/1/28 16:17, Masami Hiramatsu (Google) wrote:
> On Sun, 25 Jan 2026 15:39:05 -0500
> Aaron Tomlin <atomlin@...mlin.com> wrote:
>
>> Currently, the hung task reporting mechanism does not differentiate
>> between the underlying causes of a D state, labelling all such tasks
>> merely as "blocked". Consequently, administrators must perform manual
>> stack trace inspection to ascertain if the delay stems from an I/O wait
>> (indicative of hardware or filesystem issues) or a lock wait (indicative
>> of software contention).
>>
>> This change utilises the in_iowait field from struct task_struct to
>> distinguish between two distinct failure modes in the log output:
>>
>> 1. D state "Disk I/O": The task is waiting in io_schedule().
>> This typically implies a storage device, filesystem, or
>> network filesystem (e.g., NFS) is unresponsive.
>>
>> 2. D state "Lock/Resource": The task is waiting on a kernel
>> primitive (e.g., mutex). This typically implies a software
>> bug, deadlock, or resource starvation.
>>
>> It is safe to read in_iowait in this manner because
>> check_hung_uninterruptible_tasks() holds the RCU read lock, preserving
>> the task structure. Moreover, the task is effectively quiescent (in a
>> persistent TASK_UNINTERRUPTIBLE state) and thus cannot update its own
>> in_iowait status, guaranteeing a stable, race-free value.
>>
>> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
>> ---
>> kernel/hung_task.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 350093de0535..608731c7ccba 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -250,8 +250,9 @@ static void hung_task_info(struct task_struct *t, unsigned long timeout,
>>  	if (sysctl_hung_task_warnings || hung_task_call_panic) {
>>  		if (sysctl_hung_task_warnings > 0)
>>  			sysctl_hung_task_warnings--;
>> -		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
>> -		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
>> +		pr_err("INFO: task %s:%d blocked in %s state for more than %ld seconds.\n",
>> +		       t->comm, t->pid, t->in_iowait ? "D (Disk I/O)" : "D (Lock/Resource)",
>
> If this is only for human readability, I rather like just adding
> "in iowait" at the end. "D" state seems redundant, and "Lock/Resource"
> can mislead. What about something like below?
>
> pr_err("INFO: task %s:%d blocked for more than %ld seconds%s.\n",
> ..., t->in_iowait ? " in I/O wait" : "");
That would be better, looks good to me ;)
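
For illustration only, the suggested format can be tried out in userspace with
a rough sketch like the one below. It is not kernel code: struct mock_task and
format_hung_task_msg() are hypothetical stand-ins for struct task_struct and
the pr_err() call, and the blocked time is passed in directly rather than
derived from jiffies.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few task_struct fields the message uses. */
struct mock_task {
	const char *comm;	/* task name */
	int pid;
	bool in_iowait;		/* true if the task sleeps via io_schedule() */
};

/*
 * Hypothetical helper: formats the report the way the review suggests,
 * appending " in I/O wait" only when in_iowait is set.
 * Returns whatever snprintf() returns.
 */
static int format_hung_task_msg(char *buf, size_t len,
				const struct mock_task *t, long blocked_secs)
{
	return snprintf(buf, len,
			"INFO: task %s:%d blocked for more than %ld seconds%s.\n",
			t->comm, t->pid, blocked_secs,
			t->in_iowait ? " in I/O wait" : "");
}
```

With this shape, a task that is not in iowait keeps the existing message text
unchanged, so anything matching on the current string should still match; only
the iowait case gains a suffix.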