Message-ID: <Y8FgGev2HPX/ksUS@alley>
Date: Fri, 13 Jan 2023 14:43:53 +0100
From: Petr Mladek <pmladek@...e.com>
To: akpm@...ux-foundation.org, peterz@...radead.org,
linux-kernel@...r.kernel.org, zwp10758@...il.com
Subject: Re: [RFC PATCH] hung_task: show sysctl_hung_task_warnings
On Thu 2023-01-12 17:17:45, Weiping Zhang wrote:
> This patch try to add more debug info to detect lost kernel log or no
> hung task was detected.
>
> The user set 10 to the hung_task_timeout_secs, the kernel log:
>
> [ 3942.642220] INFO: task mount:19066 blocked for more than 10 seconds.
> [ 3952.876768] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
> [ 3952.877088] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
> [ 3952.878212] INFO: task mount:19066 blocked for more than 10 seconds.
> [ 3963.116805] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
> [ 3963.117137] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
> [ 3963.118275] INFO: task mount:19066 blocked for more than 10 seconds.
> [ 3973.356837] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
> [ 3973.357148] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
> [ 3973.358247] INFO: task mount:19066 blocked for more than 10 seconds.
> [ 3993.836899] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
> [ 3993.837238] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
> [ 3993.838356] INFO: task mount:19066 blocked for more than 10 seconds.
>
> There is no any log at about 3983, it's hard to know if kernel log was
> lost or there is no hung task was detected at that moment. So this patch
> print sysctl_hung_task_warnings to distinguish the above two cases.
>
> Signed-off-by: Weiping Zhang <zhangweiping@...iglobal.com>
> ---
> kernel/hung_task.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index c71889f3f3fc..ca917931473d 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -127,8 +127,11 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> * complain:
> */
> if (sysctl_hung_task_warnings) {
> - if (sysctl_hung_task_warnings > 0)
> + if (sysctl_hung_task_warnings > 0) {
> sysctl_hung_task_warnings--;
> + pr_err("sysctl_hung_task_warnings: %d\n",
> + sysctl_hung_task_warnings);
> + }

That is too much noise. But it might make sense to report when the
counter drops to zero. Something like:

	if (!sysctl_hung_task_warnings)
		pr_info("Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings\n");

placed after all the details for this hung task report have been printed.

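To illustrate the idea outside the kernel, here is a rough user-space sketch of the counter handling. It is not the actual kernel/hung_task.c code: report_hung_task(), hung_task_warnings and suppress_notices are hypothetical stand-ins, and printf() replaces pr_err()/pr_info(). It only assumes the semantics visible in the quoted hunk: zero means silent, a positive value is a remaining budget, and a negative value is never decremented.

```c
#include <stdio.h>

/* Stand-in for sysctl_hung_task_warnings (assumption: negative = unlimited). */
static int hung_task_warnings = 3;

/* Hypothetical counter so the behaviour can be checked in a test. */
static int suppress_notices;

/*
 * Hypothetical model of the reporting branch in check_hung_task().
 * Returns 1 if a report was printed, 0 if it was suppressed.
 */
static int report_hung_task(const char *comm, int pid, long secs)
{
	if (!hung_task_warnings)
		return 0;	/* budget exhausted: stay silent */

	if (hung_task_warnings > 0)
		hung_task_warnings--;

	printf("INFO: task %s:%d blocked for more than %ld seconds.\n",
	       comm, pid, secs);
	/* ... the remaining details of the report would go here ... */

	/* Printed once, right after the last report that used up the budget. */
	if (!hung_task_warnings) {
		printf("Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings\n");
		suppress_notices++;
	}
	return 1;
}
```

With a budget of N, the notice follows the N-th report and nothing is printed afterwards, so a later gap in the log can be attributed to suppression rather than to lost messages.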
> pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> pr_err(" %s %s %.*s\n",
Best Regards,
Petr