Message-ID: <aYr9GMxilqpNp1ig@pathway.suse.cz>
Date: Tue, 10 Feb 2026 10:40:40 +0100
From: Petr Mladek <pmladek@...e.com>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: akpm@...ux-foundation.org, lance.yang@...ux.dev, mhiramat@...nel.org,
gregkh@...uxfoundation.org, neelx@...e.com, sean@...e.io,
mproche@...il.com, chjohnst@...il.com, nick.lange@...il.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] hung_task: Explicitly report I/O wait state in log
output
On Mon 2026-02-09 09:30:34, Aaron Tomlin wrote:
> On Mon, Feb 09, 2026 at 11:14:27AM +0100, Petr Mladek wrote:
> > > Accessing in_iowait is safe in this context. The detector holds
> > > rcu_read_lock() within check_hung_uninterruptible_tasks(), ensuring the
> > > task structure remains valid in memory.
> >
> > This is true.
> >
> > > Furthermore, as the task is
> > > confirmed to be in a persistent TASK_UNINTERRUPTIBLE state, it cannot
> > > modify its own in_iowait flag, rendering the read operation stable and
> > > free from data races.
> >
> > IMHO, this is not true. The blocked tasks might wake up at any time.
> > There is a small chance of printing inconsistent information,
> > but we could live with it. The entire hung task report is racy.
> >
> > The race would actually be a lucky moment where the task gets
> > unblocked. In that case, it won't be reported in the next
> > round...
> >
> > I suggest omitting the entire paragraph.
> >
> > > Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > > Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> >
> > Otherwise, it looks good.
> >
> > With the removed paragraph:
> >
> > Reviewed-by: Petr Mladek <pmladek@...e.com>
>
> Hi Petr, Lance,
>
> Apologies. The old commit message was erroneously used.
>
> I would prefer to replace the last paragraph with the following to reflect
> that this is a diagnostic snapshot only:
>
> Theoretically, io_schedule_finish() could be called immediately after
> khungtaskd checks the flag, rendering the output stale. However,
> strictly preventing this would require blocking the waking task (e.g.,
> within the mutex unlock code path - see mutex_lock_io()) to inhibit the
> state change during the scan. This seems entirely disproportionate for
> a "best effort" diagnostic tool, especially given that, with the
> default timeout, the probability of such a race after a long hang is
> negligible.
>
> Please let me know your thoughts.
Strictly speaking, io_schedule_finish() might be called even before
khungtaskd checks the flag. It can be called in parallel at any time.
The entire khungtaskd output is racy. But it is good enough in practice,
especially because it reports long stalls. The chance that a long stall
finishes while the report is being printed is very small.
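For illustration, the racy-but-acceptable read being discussed might look
like the following sketch. It is hypothetical and only loosely modeled on
kernel/hung_task.c (in_iowait is the real task_struct flag, but the
function shape and message format here are illustrative, not the patch
itself):

```c
/* Sketch: best-effort snapshot inside the khungtaskd scan.
 * The caller holds rcu_read_lock(), so 't' stays valid, but the
 * task may wake up and clear in_iowait in parallel at any time. */
static void report_hung_task(struct task_struct *t, unsigned long timeout)
{
	/* Racy but acceptable: a plain one-shot snapshot of the flag. */
	bool iowait = READ_ONCE(t->in_iowait);

	pr_err("INFO: task %s:%d blocked%s for more than %lu seconds.\n",
	       t->comm, t->pid, iowait ? " on I/O" : "", timeout);
}
```

The snapshot may already be stale by the time the line is printed; the
point of the thread is that no extra synchronization is worth adding to
prevent that.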
And you are right. Any attempt to serialize the output seems
disproportionate here. It might make the stall even worse.
If you really want to write something about the possible race
then I would write something like:
Theoretically, io_schedule_finish() could be called in parallel,
and the flag that was read need not match the backtrace printed
later. That should be acceptable in practice: the entire report
is racy, and it works most of the time, especially because long
stalls are being reported. Adding any extra synchronization
seems entirely disproportionate in this scenario.
Best Regards,
Petr