Message-ID: <3773726f-f060-47a7-b99d-66ab0aa0d8b5@linux.dev>
Date: Wed, 17 Dec 2025 21:36:56 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Petr Mladek <pmladek@...e.com>
Cc: Aaron Tomlin <atomlin@...mlin.com>, sean@...e.io,
linux-kernel@...r.kernel.org, mhiramat@...nel.org,
akpm@...ux-foundation.org, gregkh@...uxfoundation.org
Subject: Re: [PATCH v3 2/2] hung_task: Enable runtime reset of
hung_task_detect_count
On 2025/12/17 21:09, Petr Mladek wrote:
> On Wed 2025-12-17 15:31:25, Lance Yang wrote:
>> On 2025/12/16 11:00, Aaron Tomlin wrote:
>>> Introduce support for writing to /proc/sys/kernel/hung_task_detect_count.
>>>
>>> Writing any value to this file atomically resets the counter of detected
>>> hung tasks to zero. This grants system administrators the ability to clear
>>> the cumulative diagnostic history after resolving an incident, simplifying
>>> monitoring without requiring a system restart.
>>
>>> --- a/Documentation/admin-guide/sysctl/kernel.rst
>>> +++ b/Documentation/admin-guide/sysctl/kernel.rst
>>> @@ -418,7 +418,7 @@ hung_task_detect_count
>>> ======================
>>> Indicates the total number of tasks that have been detected as hung since
>>> -the system boot.
>>> +the system boot. The counter can be reset to zero when written to.
>>> This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
>>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>>> index 5902573200c0..01ce46a107b0 100644
>>> --- a/kernel/hung_task.c
>>> +++ b/kernel/hung_task.c
>>> @@ -375,6 +375,31 @@ static long hung_timeout_jiffies(unsigned long last_checked,
>>> }
>>> #ifdef CONFIG_SYSCTL
>>> +
>>> +/**
>>> + * proc_dohung_task_detect_count - proc handler for hung_task_detect_count
>>> + * @table: Pointer to the struct ctl_table definition for this proc entry
>>> + * @write: Flag indicating the operation
>>> + * @buffer: User space buffer for data transfer
>>> + * @lenp: Pointer to the length of the data being transferred
>>> + * @ppos: Pointer to the current file offset
>>> + *
>>> + * This handler is used for reading the current hung task detection count
>>> + * and for resetting it to zero when a write operation is performed.
>>> + * Returns 0 on success or a negative error code on failure.
>>> + */
>>> +static int proc_dohung_task_detect_count(const struct ctl_table *table, int write,
>>> + void *buffer, size_t *lenp, loff_t *ppos)
>>> +{
>>> + if (!write)
>>> + return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
>>> +
>>> + WRITE_ONCE(sysctl_hung_task_detect_count, 0);
>>
>> The reset uses WRITE_ONCE() but the increment in check_hung_task()
>> is a plain ++. It's just a stat counter, so that's fine though ;)
>
> I was about to wave this away as well. But I am afraid that
> it is even more complicated.
Good catch!
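For the tearing alone, I was thinking the increment in
check_hung_task() could simply be annotated to pair with the reset,
something like this (untested):

	/* Pair with the WRITE_ONCE() reset in the sysctl handler. */
	WRITE_ONCE(sysctl_hung_task_detect_count,
		   sysctl_hung_task_detect_count + 1);

But as you show below, torn accesses are the least of the problems
here.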
>
> The counter is used to decide how many hung tasks were found
> by a single check, see:
>
> static void check_hung_uninterruptible_tasks(unsigned long timeout)
> {
> [...]
> unsigned long prev_detect_count = sysctl_hung_task_detect_count;
> [...]
> for_each_process_thread(g, t) {
> [...]
> check_hung_task(t, timeout, prev_detect_count);
> }
> [...]
> if (!(sysctl_hung_task_detect_count - prev_detect_count))
> return;
>
> if (need_warning || hung_task_call_panic) {
> si_mask |= SYS_INFO_LOCKS;
>
> if (sysctl_hung_task_all_cpu_backtrace)
> si_mask |= SYS_INFO_ALL_BT;
> }
>
> sys_info(si_mask);
>
> if (hung_task_call_panic)
> panic("hung_task: blocked tasks");
> }
>
> Any race might cause false positives and even panic() !!!
> And the race window is rather big (checking all processes).
Right, I completely missed the dependency on prev_detect_count.
Messing with that counter while the loop is running is really
dangerous ...
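Just to spell out the failure modes for myself, a hypothetical
interleaving with made-up numbers:

	/*
	 * khungtaskd                           sysctl write
	 * ----------                           ------------
	 * prev_detect_count = counter;  // 5
	 *                                      WRITE_ONCE(counter, 0);
	 * scan finds nothing new        // counter stays 0
	 * counter - prev_detect_count   // 0 - 5, nonzero as unsigned long
	 *
	 * => the early return is skipped and sys_info() fires with
	 *    nothing actually hung.
	 *
	 * In the other direction, if the scan detects exactly five new
	 * hung tasks after the reset, the difference is zero and we
	 * return early, skipping the sys_info() report and a configured
	 * panic.
	 */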
>
> This race can't be prevented by an atomic read/write.
> IMHO, the only solution would be to add some locking,
> e.g. mutex and take it around the entire
> check_hung_uninterruptible_tasks().
>
> On one hand, it might work. The code is called from
> a task context...
>
> But adding a lock into a lockup-detector code triggers
> some warning bells in my head.
>
> Honestly, I am not sure if it is worth it. I understand
> the motivation for the reset. But IMHO, even more important
> is to make sure that the watchdog works as expected.
Given that adding a lock to the detector path is undesirable,
the risk of a racy reset clearly outweighs its benefit :)
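FWIW, just to write down what that would look like, I guess something
roughly like this (completely untested sketch; hung_task_mutex is a
made-up name):

	static DEFINE_MUTEX(hung_task_mutex);

	static void check_hung_uninterruptible_tasks(unsigned long timeout)
	{
		unsigned long prev_detect_count;

		mutex_lock(&hung_task_mutex);
		prev_detect_count = sysctl_hung_task_detect_count;
		/* ... existing scan, reporting, and panic ... */
		mutex_unlock(&hung_task_mutex);
	}

	static int proc_dohung_task_detect_count(const struct ctl_table *table,
			int write, void *buffer, size_t *lenp, loff_t *ppos)
	{
		if (!write)
			return proc_doulongvec_minmax(table, write, buffer,
						      lenp, ppos);

		/* Serialize the reset against a scan in flight. */
		mutex_lock(&hung_task_mutex);
		sysctl_hung_task_detect_count = 0;
		mutex_unlock(&hung_task_mutex);

		/* Consume the write; any value means "reset". */
		*ppos += *lenp;
		return 0;
	}

But I fully agree it's not worth it just to reset a debug counter.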
Thanks for saving me from this pitfall, Petr!