Message-ID: <qhlqu7midrcrzug6rugsz555u22x7mnvhc7bqazdojy5aauw35@lbdviezrvqb3>
Date: Mon, 9 Feb 2026 09:30:34 -0500
From: Aaron Tomlin <atomlin@...mlin.com>
To: Petr Mladek <pmladek@...e.com>
Cc: akpm@...ux-foundation.org, lance.yang@...ux.dev, mhiramat@...nel.org,
gregkh@...uxfoundation.org, neelx@...e.com, sean@...e.io, mproche@...il.com,
chjohnst@...il.com, nick.lange@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] hung_task: Explicitly report I/O wait state in log output
On Mon, Feb 09, 2026 at 11:14:27AM +0100, Petr Mladek wrote:
> > Accessing in_iowait is safe in this context. The detector holds
> > rcu_read_lock() within check_hung_uninterruptible_tasks(), ensuring the
> > task structure remains valid in memory.
>
> This is true.
>
> > Furthermore, as the task is
> > confirmed to be in a persistent TASK_UNINTERRUPTIBLE state, it cannot
> > modify its own in_iowait flag, rendering the read operation stable and
> > free from data races.
>
> IMHO, this is not true. The blocked tasks might wake up at any time.
> There is a small chance of printing inconsistent information.
> But we could live with it. The entire hung task report is racy.
>
> The race would actually be a lucky moment where the task gets
> unblocked. In this case, it won't be reported in the next
> round...
>
> I suggest omitting the entire paragraph.
>
> > Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
>
> Otherwise, it looks good.
>
> With the removed paragraph:
>
> Reviewed-by: Petr Mladek <pmladek@...e.com>

Hi Petr, Lance,

Apologies. The old commit message was erroneously used.

I would prefer to replace the last paragraph with the following to reflect
that this is a diagnostic snapshot only:

    Theoretically, io_schedule_finish() could be called immediately after
    khungtaskd checks the flag, rendering the output stale. However,
    strictly preventing this would require blocking the waking task (e.g.,
    within the mutex unlock code path; see mutex_lock_io()) to inhibit the
    state change during the scan. This seems entirely disproportionate for
    a "best effort" diagnostic tool, especially as the probability of such
    a race is negligible once a task has been hung for longer than the
    detector timeout (120 seconds by default).

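
To make the interleaving concrete, here is a rough user-space model of it.
This is only a sketch and not the kernel code: struct model_task and every
model_*() helper are hypothetical names invented for illustration, and the
flag is a C11 relaxed atomic purely so that the user-space program is well
defined (the kernel's in_iowait is an ordinary task_struct field). The
sequence is replayed in a single thread so that the outcome is
deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the one task_struct field of interest. */
struct model_task {
	atomic_int in_iowait;
};

/* Models io_schedule_prepare(): set the flag, return the previous value. */
static int model_io_schedule_prepare(struct model_task *t)
{
	return atomic_exchange_explicit(&t->in_iowait, 1, memory_order_relaxed);
}

/* Models io_schedule_finish(): restore the value saved by prepare. */
static void model_io_schedule_finish(struct model_task *t, int token)
{
	atomic_store_explicit(&t->in_iowait, token, memory_order_relaxed);
}

/* Models the detector: one unlocked, best-effort read of the flag. */
static bool model_snapshot_iowait(struct model_task *t)
{
	return atomic_load_explicit(&t->in_iowait, memory_order_relaxed) != 0;
}

/*
 * Replays the race: the detector samples the flag, then the task leaves
 * I/O wait straight away. Returns true when the sampled value no longer
 * matches the task's state, i.e. the report would be stale.
 */
bool model_stale_snapshot(void)
{
	struct model_task t;
	int token;
	bool reported, now;

	atomic_init(&t.in_iowait, 0);
	token = model_io_schedule_prepare(&t);  /* task blocks on I/O */
	reported = model_snapshot_iowait(&t);   /* detector samples the flag */
	model_io_schedule_finish(&t, token);    /* task wakes right afterwards */
	now = model_snapshot_iowait(&t);

	assert(reported);                       /* the sample did see I/O wait */
	printf("reported=%d now=%d\n", reported, now);
	return reported != now;
}
```

The stale value is harmless in this model: the task has made progress, so
it would simply not be reported on the next scan.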
Please let me know your thoughts.
Kind regards,
--
Aaron Tomlin