Message-ID: <nbnqjuu4uzrfgy5swllu45xab6xtyhtdf3f272frz2ir66c7gq@34tokedhi2cx>
Date: Wed, 4 Feb 2026 09:07:36 -0500
From: Aaron Tomlin <atomlin@...mlin.com>
To: Petr Mladek <pmladek@...e.com>
Cc: Lance Yang <lance.yang@...ux.dev>, neelx@...e.com, sean@...e.io,
akpm@...ux-foundation.org, mproche@...il.com, chjohnst@...il.com, nick.lange@...il.com,
linux-kernel@...r.kernel.org, mhiramat@...nel.org, joel.granados@...nel.org,
gregkh@...uxfoundation.org
Subject: Re: [v7 PATCH 1/2] hung_task: Refactor detection logic and atomicise detection count
On Tue, Feb 03, 2026 at 10:03:52AM +0100, Petr Mladek wrote:
> On Tue 2026-02-03 11:08:33, Lance Yang wrote:
> > On 2026/2/3 11:05, Lance Yang wrote:
> > > On 2026/1/25 21:58, Aaron Tomlin wrote:
> > > > The check_hung_task() function currently conflates two distinct
> > > > responsibilities: validating whether a task is hung and handling the
> > > > subsequent reporting (printing warnings, triggering panics, or
> > > > tracepoints).
> > > >
> > > > This patch refactors the logic by introducing hung_task_info(), a
> > > > function dedicated solely to reporting. The actual detection check,
> > > > task_is_hung(), is hoisted into the primary loop within
> > > > check_hung_uninterruptible_tasks(). This separation clearly decouples
> > > > the mechanism of detection from the policy of reporting.
> > > >
> > > > Furthermore, to facilitate future support for concurrent hung task
> > > > detection, the global sysctl_hung_task_detect_count variable is
> > > > converted from unsigned long to atomic_long_t. Consequently, the
> > > > counting logic is updated to accumulate the number of hung tasks locally
> > > > (this_round_count) during the iteration. The global counter is then
> > > > updated atomically via atomic_long_cmpxchg_relaxed() once the loop
> > > > concludes, rather than incrementally during the scan.
> > > >
> > > > These changes are strictly preparatory and introduce no functional
> > > > change to the system's runtime behaviour.
> > > >
> > > > Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> > > > ---
> > > > kernel/hung_task.c | 58 ++++++++++++++++++++++++++--------------------
> > > > 1 file changed, 33 insertions(+), 25 deletions(-)
> > > >
> > > > diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> > > > index d2254c91450b..df10830ed9ef 100644
> > > > --- a/kernel/hung_task.c
> > > > +++ b/kernel/hung_task.c
> > > > @@ -36,7 +36,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
> > > >  /*
> > > >   * Total number of tasks detected as hung since boot:
> > > >   */
> > > > -static unsigned long __read_mostly sysctl_hung_task_detect_count;
> > > > +static atomic_long_t sysctl_hung_task_detect_count = ATOMIC_LONG_INIT(0);
> > > > 
> > > >  /*
> > > >   * Limit number of tasks checked in a batch.
> > > > @@ -223,31 +223,29 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
> > > >  }
> > > >  #endif
> > > > 
> > > > -static void check_hung_task(struct task_struct *t, unsigned long timeout,
> > > > -			    unsigned long prev_detect_count)
> > > > +/**
> > > > + * hung_task_info - Print diagnostic details for a hung task
> > > > + * @t: Pointer to the detected hung task.
> > > > + * @timeout: Timeout threshold for detecting hung tasks
> > > > + * @this_round_count: Count of hung tasks detected in the current iteration
> > > > + *
> > > > + * Print structured information about the specified hung task, if warnings
> > > > + * are enabled or if the panic batch threshold is exceeded.
> > > > + */
> > > > +static void hung_task_info(struct task_struct *t, unsigned long timeout,
> > > > +			   unsigned long this_round_count)
> > > >  {
> > > > -	unsigned long total_hung_task;
> > > > -
> > > > -	if (!task_is_hung(t, timeout))
> > > > -		return;
> > > > -
> > > > -	/*
> > > > -	 * This counter tracks the total number of tasks detected as hung
> > > > -	 * since boot.
> > > > -	 */
> > > > -	sysctl_hung_task_detect_count++;
> > >
> > > Previously, the global detect count was updated immediately when a hung
> > > task was found. But now it only updates after the full scan finishes ...
> > >
> > > Ideally, the count should update as soon as possible, so that userspace
> > > can react in time :)
> > >
> > > For example, by migrating critical containers away from the node before
> > > the situation gets worse - something we already do.
> >
> > Sorry, I should have said that earlier - just realized it ...
>
> Better late than sorry ;-)
>
> That said, is the delay really critical? I guess that the userspace
> checks the counter in regular intervals (seconds or tens of seconds).
> Or is there any way to get a notification immediately?
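For reference, such watchers typically just poll the sysctl at a fixed
interval, so the delay is bounded by the poll period either way. Below is
a minimal user-space poller along those lines; a sketch only, and the
/proc path assumes the hung_task_detect_count knob that exposes this
counter:

	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		const char *path = "/proc/sys/kernel/hung_task_detect_count";
		long last = 0, cur;

		for (;;) {
			FILE *f = fopen(path, "r");

			if (f) {
				/*
				 * React when the count rises, e.g. start
				 * migrating critical containers away.
				 */
				if (fscanf(f, "%ld", &cur) == 1) {
					if (cur > last)
						printf("hung tasks: %ld (+%ld)\n",
						       cur, cur - last);
					last = cur;	/* follows a reset back down too */
				}
				fclose(f);
			}
			sleep(10);	/* the polling interval mentioned above */
		}
	}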
>
> Anyway, I thought how the counting and barriers might work when
> we update the global counter immediately. And I came up with
> the following:
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 350093de0535..8bc043fbe89c 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -302,15 +302,10 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  	int max_count = sysctl_hung_task_check_count;
>  	unsigned long last_break = jiffies;
>  	struct task_struct *g, *t;
> -	unsigned long total_count, this_round_count;
> +	unsigned long this_round_count;
>  	int need_warning = sysctl_hung_task_warnings;
>  	unsigned long si_mask = hung_task_si_mask;
> 
> -	/*
> -	 * The counter might get reset. Remember the initial value.
> -	 * Acquire prevents reordering task checks before this point.
> -	 */
> -	total_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count);
>  	/*
>  	 * If the system crashed already then all bets are off,
>  	 * do not report extra hung tasks:
> @@ -330,6 +325,13 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  		}
> 
>  		if (task_is_hung(t, timeout)) {
> +			/*
> +			 * Increment the global counter so that userspace could
> +			 * start migrating tasks ASAP. But count the current
> +			 * round separately because userspace could reset
> +			 * the global counter at any time.
> +			 */
> +			atomic_long_inc(&sysctl_hung_task_detect_count);
>  			this_round_count++;
>  			hung_task_info(t, timeout, this_round_count);
>  		}
> @@ -340,15 +342,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  	if (!this_round_count)
>  		return;
> 
> -	/*
> -	 * Do not count this round when the global counter has been reset
> -	 * during this check. Release ensures we see all hang details
> -	 * recorded during the scan.
> -	 */
> -	atomic_long_cmpxchg_release(&sysctl_hung_task_detect_count,
> -				    total_count, total_count +
> -				    this_round_count);
> -
>  	if (need_warning || hung_task_call_panic) {
>  		si_mask |= SYS_INFO_LOCKS;
> 
>
> I am not sure if the comment above the increment is needed.
> Well, it might help people understand the motivation without
> digging into the git log history.
Hi Petr,
Agreed.
By moving to a "relaxed" atomic_long_inc(), we now rely only on the
atomicity of each individual operation. If a user resets the counter
(writes 0) concurrently:

	CPU 0				CPU 1
	khungtaskd increments (1)
					User resets (0)
	khungtaskd increments (1)

In the above, the counter ends up reflecting 1, which is acceptable
behaviour for a "live" counter. The strict protection against "lost
updates" that the batched calculation (i.e., old + new) needed is not
required for a simple atomic increment.
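To make the difference concrete, here is a small user-space model of the
two schemes (C11 atomics standing in for the kernel's atomic_long_t; the
helper names are illustrative, not kernel API):

	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_long detect_count;

	/* Scheme A: publish each detection immediately (your diff above). */
	static void hung_detected_immediate(void)
	{
		atomic_fetch_add_explicit(&detect_count, 1, memory_order_relaxed);
	}

	/* Scheme B: batch the round and publish once via cmpxchg (v7). */
	static void scan_batched(long this_round)
	{
		long old = atomic_load_explicit(&detect_count, memory_order_acquire);

		/* Model a userspace reset landing between read and publish. */
		atomic_store_explicit(&detect_count, 0, memory_order_relaxed);

		/* The counter changed, so this fails: the round is dropped. */
		atomic_compare_exchange_strong(&detect_count, &old, old + this_round);
	}

	int main(void)
	{
		/* Scheme A: increment (1), reset (0), increment (1) => 1. */
		atomic_store(&detect_count, 0);
		hung_detected_immediate();
		atomic_store(&detect_count, 0);		/* userspace reset */
		hung_detected_immediate();
		printf("immediate: %ld\n", atomic_load(&detect_count));

		/* Scheme B: 5 earlier hangs, reset mid-scan, 3 found => 0. */
		atomic_store(&detect_count, 5);
		scan_batched(3);
		printf("batched:   %ld\n", atomic_load(&detect_count));
		return 0;
	}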
Kind regards,
--
Aaron Tomlin