Message-ID: <0d15cf75-abbd-446d-86fa-49ea251f7a82@linux.dev>
Date: Mon, 21 Jul 2025 12:56:25 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Ye Liu <ye.liu@...ux.dev>
Cc: Ye Liu <liuye@...inos.cn>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, Zi Li <zi.li@...ux.dev>
Subject: Re: [PATCH] hung_task: add warning counter to blocked task report
Hi Ye,
Thanks for your patch!
On 2025/7/21 11:17, Ye Liu wrote:
> From: Ye Liu <liuye@...inos.cn>
>
> Add a warning counter to each hung task message to make it easier
> to analyze and locate issues in the logs.
>
> Signed-off-by: Ye Liu <liuye@...inos.cn>
> ---
> kernel/hung_task.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 8708a1205f82..9e5f86148d47 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -58,6 +58,7 @@ EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);
> static unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
>
> static int __read_mostly sysctl_hung_task_warnings = 10;
> +static int hung_task_warning_count;
>
> static int __read_mostly did_panic;
> static bool hung_task_show_lock;
> @@ -232,8 +233,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  	if (sysctl_hung_task_warnings || hung_task_call_panic) {
>  		if (sysctl_hung_task_warnings > 0)
>  			sysctl_hung_task_warnings--;
> -		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> -		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> +		pr_err("INFO: task %s:%d blocked for more than %ld seconds. [Warning #%d]\n",
> +		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ,
> +		       ++hung_task_warning_count);
>  		pr_err(" %s %s %.*s\n",
>  			print_tainted(), init_utsname()->release,
>  			(int)strcspn(init_utsname()->version, " "),
A quick thought on this: we already have the hung_task_detect_count
counter, which tracks the total number of hung tasks detected since
boot ;)
While this patch adds a counter inline with the warning message, the
existing counter already provides a way to know how many hung task
events have occurred.
Could you elaborate on the specific benefit of printing this count
directly in the log, compared to checking the global hung_task_detect_count?
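For reference, a rough userspace sketch of checking that counter. It
assumes hung_task_detect_count is exposed as a sysctl file at
/proc/sys/kernel/hung_task_detect_count (CONFIG_DETECT_HUNG_TASK
enabled); the helper names parse_counter() and read_counter() are made
up for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal counter such as "3\n".
 * Returns 0 on success, -1 if the text does not start with a digit
 * or the value is out of range. */
static int parse_counter(const char *text, unsigned long *out)
{
	char *end;
	unsigned long v;

	if (!isdigit((unsigned char)text[0]))
		return -1;

	errno = 0;
	v = strtoul(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;

	*out = v;
	return 0;
}

/* Hypothetical helper: read one counter value from a sysctl-style file.
 * The path is passed in by the caller; the usual location is assumed
 * to be /proc/sys/kernel/hung_task_detect_count. */
static int read_counter(const char *path, unsigned long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_counter(buf, out);
}
```

Calling read_counter("/proc/sys/kernel/hung_task_detect_count", &n)
before and after a test run and comparing the two values would show how
many hung tasks were detected in between, without parsing the log.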
Also, if the goal is to give each warning a unique sequence number,
I think the dmesg timestamp prefix serves the same purpose ;)
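To illustrate the timestamp point, here is a small sketch that pulls
the "[seconds.microseconds]" prefix off a log line so it can be used as
a per-warning key. It assumes printk timestamps are enabled and that
the line has the default dmesg layout; parse_dmesg_timestamp() and the
task name "foo" in the usage note are made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: extract the "[ sec.usec]" prefix from a dmesg
 * line. Returns 0 on success, -1 if the line has no such prefix. */
static int parse_dmesg_timestamp(const char *line,
				 unsigned long *sec, unsigned long *usec)
{
	/* %lu skips the padding spaces that follow the '['. */
	if (sscanf(line, "[%lu.%lu]", sec, usec) != 2)
		return -1;
	return 0;
}
```

Given a line such as
"[ 1234.567890] INFO: task foo:42 blocked for more than 120 seconds.",
this yields sec = 1234 and usec = 567890, which is enough to tell two
reports apart and to order them.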
Thanks,
Lance