Message-ID: <79da6790-b972-417b-8004-bfb4084af82a@linux.dev>
Date: Mon, 26 Jan 2026 10:30:22 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: neelx@...e.com, sean@...e.io, pmladek@...e.com,
gregkh@...uxfoundation.org, joel.granados@...nel.org, mproche@...il.com,
chjohnst@...il.com, nick.lange@...il.com, linux-kernel@...r.kernel.org,
mhiramat@...nel.org, akpm@...ux-foundation.org
Subject: Re: [PATCH] hung_task: Differentiate between I/O and Lock/Resource waits

On 2026/1/26 04:39, Aaron Tomlin wrote:
> Currently, the hung task reporting mechanism does not differentiate
> between the underlying causes of a D state, labelling all such tasks
> merely as "blocked". Consequently, administrators must perform manual
> stack trace inspection to ascertain if the delay stems from an I/O wait
> (indicative of hardware or filesystem issues) or a lock wait (indicative
> of software contention).
>
> This change utilises the in_iowait field from struct task_struct to
> distinguish between two distinct failure modes in the log output:
>
> 1. D state "Disk I/O": The task is waiting in io_schedule().
> This typically implies a storage device, filesystem, or
> network filesystem (e.g., NFS) is unresponsive.
>
> 2. D state "Lock/Resource": The task is waiting on a kernel
> primitive (e.g., mutex). This typically implies a software
> bug, deadlock, or resource starvation.
>
> It is safe to read in_iowait in this manner because
> check_hung_uninterruptible_tasks() holds the RCU read lock, preserving
> the task structure. Moreover, the task is effectively quiescent (in a
> persistent TASK_UNINTERRUPTIBLE state) and thus cannot update its own
> in_iowait status, guaranteeing a stable, race-free value.
>
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
> kernel/hung_task.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 350093de0535..608731c7ccba 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -250,8 +250,9 @@ static void hung_task_info(struct task_struct *t, unsigned long timeout,
>  	if (sysctl_hung_task_warnings || hung_task_call_panic) {
>  		if (sysctl_hung_task_warnings > 0)
>  			sysctl_hung_task_warnings--;
> -		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> -		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> +		pr_err("INFO: task %s:%d blocked in %s state for more than %ld seconds.\n",
> +		       t->comm, t->pid, t->in_iowait ? "D (Disk I/O)" : "D (Lock/Resource)",
> +		       (jiffies - t->last_switch_time) / HZ);
>  		pr_err(" %s %s %.*s\n",
>  			print_tainted(), init_utsname()->release,
>  			(int)strcspn(init_utsname()->version, " "),

Why do we need this?

The stack trace already makes it rather obvious whether the task is in
"D (Disk I/O)", "D (Lock/Resource)", or some other "D (...)" case.
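
For reference, here is a minimal user-space sketch of what the proposed
label selection amounts to. It is not kernel code: struct mock_task,
d_state_label() and format_hung_task() are hypothetical names made up for
illustration, and only the format string and the in_iowait test are taken
from the quoted hunk.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the few task_struct fields the quoted hunk
 * reads. The real structure lives in the kernel and is far larger.
 */
struct mock_task {
	const char *comm;        /* task name */
	int pid;                 /* task id */
	unsigned int in_iowait:1; /* set while the task sleeps via io_schedule() */
};

/* Same two-way choice as the ternary in the proposed pr_err() call. */
static const char *d_state_label(const struct mock_task *t)
{
	return t->in_iowait ? "D (Disk I/O)" : "D (Lock/Resource)";
}

/*
 * Build the proposed log line into buf. secs stands in for
 * (jiffies - t->last_switch_time) / HZ, which has no user-space equivalent.
 * Returns what snprintf() returns.
 */
static int format_hung_task(char *buf, size_t len,
			    const struct mock_task *t, long secs)
{
	return snprintf(buf, len,
			"INFO: task %s:%d blocked in %s state for more than %ld seconds.\n",
			t->comm, t->pid, d_state_label(t), secs);
}
```

Note that the choice is strictly binary: any D-state wait that did not go
through io_schedule() gets the "Lock/Resource" label, whatever the task is
actually waiting on, so the stack trace is still needed to tell those
cases apart.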