Message-ID: <aYr9GMxilqpNp1ig@pathway.suse.cz>
Date: Tue, 10 Feb 2026 10:40:40 +0100
From: Petr Mladek <pmladek@...e.com>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: akpm@...ux-foundation.org, lance.yang@...ux.dev, mhiramat@...nel.org,
	gregkh@...uxfoundation.org, neelx@...e.com, sean@...e.io,
	mproche@...il.com, chjohnst@...il.com, nick.lange@...il.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] hung_task: Explicitly report I/O wait state in log
 output

On Mon 2026-02-09 09:30:34, Aaron Tomlin wrote:
> On Mon, Feb 09, 2026 at 11:14:27AM +0100, Petr Mladek wrote:
> > > Accessing in_iowait is safe in this context. The detector holds
> > > rcu_read_lock() within check_hung_uninterruptible_tasks(), ensuring the
> > > task structure remains valid in memory.
> > 
> > This is true.
> > 
> > > Furthermore, as the task is
> > > confirmed to be in a persistent TASK_UNINTERRUPTIBLE state, it cannot
> > > modify its own in_iowait flag, rendering the read operation stable and
> > > free from data races.
> > 
> > IMHO, this is not true. The blocked tasks might wake up at any time.
> > There is a small chance of printing inconsistent information.
> > But we could live with it. The entire hung task report is racy.
> > 
> > The race would actually be a lucky moment where the task gets
> > unblocked. In this case, it won't be reported in the next
> > round...
> > 
> > I suggest omitting the entire paragraph.
> > 
> > > Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > > Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> > 
> > Otherwise, it looks good.
> > 
> > With the removed paragraph:
> > 
> > Reviewed-by: Petr Mladek <pmladek@...e.com>
> 
> Hi Petr, Lance,
> 
> Apologies. The old commit message was erroneously used.
> 
> I would prefer to replace the last paragraph with the following to reflect
> that this is a diagnostic snapshot only:
> 
>     Theoretically, io_schedule_finish() could be called immediately after
>     khungtaskd checks the flag, rendering the output stale. However,
>     strictly preventing this would require blocking the waking task (e.g.,
>     within the mutex unlock code path - see mutex_lock_io()) to inhibit the
>     state change during the scan. This seems entirely disproportionate for
>     a "best effort" diagnostic tool, especially given the probability of
>     such a race is negligible after a long hang, by default.
> 
> Please let me know your thoughts.

Strictly speaking, io_schedule_finish() might be called even before
khungtaskd checks the flag. It can be called in parallel at any time.
The entire khungtaskd output is racy. But it is good enough in practice,
especially because it reports long stalls. The chance that a long stall
finishes while the report is being printed is very small.
And you are right: any attempt to serialize the output seems
disproportionate here. It might even make the stall worse.

If you really want to write something about the possible race
then I would write something like:

  Theoretically, io_schedule_finish() could be called in parallel,
  so the flag that was read need not match the backtrace printed
  later. This should be acceptable in practice: the entire report
  is racy anyway, and it works most of the time, especially
  because long stalls are being reported. Adding any extra
  synchronization seems entirely disproportionate in this
  scenario.

Best Regards,
Petr
