Message-Id: <20260128171749.f46ee8511deebe034b11171b@kernel.org>
Date: Wed, 28 Jan 2026 17:17:49 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: akpm@...ux-foundation.org, lance.yang@...ux.dev,
gregkh@...uxfoundation.org, pmladek@...e.com, joel.granados@...nel.org,
neelx@...e.com, sean@...e.io, mproche@...il.com, chjohnst@...il.com,
nick.lange@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hung_task: Differentiate between I/O and Lock/Resource
waits
On Sun, 25 Jan 2026 15:39:05 -0500
Aaron Tomlin <atomlin@...mlin.com> wrote:
> Currently, the hung task reporting mechanism does not differentiate
> between the underlying causes of a D state, labelling all such tasks
> merely as "blocked". Consequently, administrators must perform manual
> stack trace inspection to ascertain if the delay stems from an I/O wait
> (indicative of hardware or filesystem issues) or a lock wait (indicative
> of software contention).
>
> This change utilises the in_iowait field from struct task_struct to
> distinguish between two distinct failure modes in the log output:
>
> 1. D state "Disk I/O": The task is waiting in io_schedule().
> This typically implies a storage device, filesystem, or
> network filesystem (e.g., NFS) is unresponsive.
>
> 2. D state "Lock/Resource": The task is waiting on a kernel
> primitive (e.g., mutex). This typically implies a software
> bug, deadlock, or resource starvation.
>
> It is safe to read in_iowait in this manner because
> check_hung_uninterruptible_tasks() holds the RCU read lock, preserving
> the task structure. Moreover, the task is effectively quiescent (in a
> persistent TASK_UNINTERRUPTIBLE state) and thus cannot update its own
> in_iowait status, guaranteeing a stable, race-free value.
>
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
> kernel/hung_task.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 350093de0535..608731c7ccba 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -250,8 +250,9 @@ static void hung_task_info(struct task_struct *t, unsigned long timeout,
> if (sysctl_hung_task_warnings || hung_task_call_panic) {
> if (sysctl_hung_task_warnings > 0)
> sysctl_hung_task_warnings--;
> - pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> - t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> + pr_err("INFO: task %s:%d blocked in %s state for more than %ld seconds.\n",
> + t->comm, t->pid, t->in_iowait ? "D (Disk I/O)" : "D (Lock/Resource)",
If this is only for human readability, I would prefer just appending
"in iowait" at the end. The "D" state label seems redundant, and
"Lock/Resource" can be misleading. What about something like the following?
pr_err("INFO: task %s:%d blocked for more than %ld seconds%s.\n",
..., t->in_iowait ? " in I/O wait" : "");
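For illustration only, the suggested format can be tried out in userspace. This is a rough sketch, not kernel code: struct fake_task and format_hung_task_msg() are hypothetical stand-ins for struct task_struct and the pr_err() call, and the elapsed seconds are passed in directly instead of being derived from jiffies.

```c
#include <stdio.h>

/* Hypothetical stand-in for the task_struct fields the message uses. */
struct fake_task {
	const char *comm;	/* task name */
	int pid;
	int in_iowait;		/* non-zero while the task waits in io_schedule() */
};

/*
 * Build the report line in the suggested style: the base message is
 * unchanged, and " in I/O wait" is appended only when in_iowait is set.
 * Returns 0 on success, -1 if the buffer was too small or on error.
 */
static int format_hung_task_msg(char *buf, size_t len,
				const struct fake_task *t, long secs)
{
	int n = snprintf(buf, len,
			 "INFO: task %s:%d blocked for more than %ld seconds%s.\n",
			 t->comm, t->pid, secs,
			 t->in_iowait ? " in I/O wait" : "");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With this shape, a task that is not in I/O wait produces exactly the same line as today, so existing log parsers keep matching it.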
Thank you,
> + (jiffies - t->last_switch_time) / HZ);
> pr_err(" %s %s %.*s\n",
> print_tainted(), init_utsname()->release,
> (int)strcspn(init_utsname()->version, " "),
> --
> 2.51.0
>
--
Masami Hiramatsu (Google) <mhiramat@...nel.org>