linux-kernel - Re: [PATCH] hung_task: configurable hung-task stacktrace loglevel

Message-ID: <84tt63c9n7.fsf@jogness.linutronix.de>
Date: Fri, 02 May 2025 17:36:52 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>, Tomasz Figa <tfiga@...omium.org>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Ingo Molnar
 <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Juri Lelli
 <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>, Ben Segall
 <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, Steven Rostedt
 <rostedt@...dmis.org>, Andrew Morton <akpm@...ux-foundation.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hung_task: configurable hung-task stacktrace loglevel

On 2025-05-02, Petr Mladek <pmladek@...e.com> wrote:
>> The problem with the special lines is that it completely breaks any
>> line-based processing in a data pipeline. For a piece of
>> infrastructure that needs to deal with thousands of reports, on an
>> on-demand basis, that would mean quite a bit of sequential work done
>> instead of doing it in parallel and taking much more time to answer
>> users' queries.
>> 
>> That could be worked around, though, if we could prefix each line
>> separately with some special tag in addition to log level, timestamp
>> and caller. Borrowing from Sergey's earlier example:
>> 
>> <3>[  125.297687][  T140][E] INFO: task zsh:470 blocked for more than 61 seconds.
>> <3>[  125.302321][  T140][E]       Not tainted 6.15.0-rc3-next-20250424-00001-g258d8df78c77-dirty #154
>> <3>[  125.309333][  T140][E] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> <6>[  125.315040][  T140][E] task:zsh             state:D stack:0 pid:470   tgid:470   ppid:430    task_flags:0x400100 flags:0x00004002
>> <6>[  125.320594][  T140][E] Call Trace:
>> <6>[  125.322327][  T140][E]  <TASK>
>> <6>[  125.323852][  T140][E]  __schedule+0x13b4/0x2120
>> <6>[  125.325459][  T140][E]  ? schedule+0xdc/0x280
>> <6>[  125.327100][  T140][E]  schedule+0xdc/0x280
>> <6>[  125.328590][  T140][E]  schedule_preempt_disabled+0x10/0x20
>> <6>[  125.330589][  T140][E]  __mutex_lock+0x698/0x1200
>> <6>[  125.332291][  T140][E]  ? __mutex_lock+0x485/0x1200
>> <6>[  125.334074][  T140][E]  mutex_lock+0x81/0x90
>> <6>[  125.335113][  T140][E]  drop_caches_sysctl_handler+0x3e/0x140
>> <6>[  125.336665][  T140][E]  proc_sys_call_handler+0x327/0x4f0
>> <6>[  125.338069][  T140][E]  vfs_write+0x794/0xb60
>> <6>[  125.339216][  T140][E]  ? proc_sys_read+0x10/0x10
>> <6>[  125.340568][  T140][E]  ksys_write+0xb8/0x170
>> <6>[  125.341701][  T140][E]  do_syscall_64+0xd0/0x1a0
>> <6>[  125.343009][  T140][E]  ? arch_exit_to_user_mode_prepare+0x11/0x60
>> <6>[  125.344612][  T140][E]  ? irqentry_exit_to_user_mode+0x7e/0xa0
>> <6>[  125.346260][  T140][E]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
>> 
>> where [E] would mean an "emergency" message, rather than something
>> usual, regardless of the loglevel.
>
> This is an interesting idea. It has several advantages. It would:
>
>   + still allow filtering out the extra details on too-slow consoles [1]
>   + work even when the "cut here" prefix/postfix lines get lost
>   + obsolete the config option forcing the same loglevel in an emergency
>       section => save space in struct task_struct. [2]

So I guess this would introduce a new printk_info_flags emergency
flag. The information needs to be stored in the ringbuffer.
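
A rough userspace sketch of what I mean, not a patch. The first three
flag values mirror the existing enum printk_info_flags as far as I
recall. LOG_EMERGENCY, its value, struct record_info (a simplified
stand-in for the per-record meta data) and both helpers are assumptions
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum printk_info_flags {
	LOG_FORCE_CON = 1,  /* always show on console, ignore console_loglevel */
	LOG_NEWLINE   = 2,  /* text ended with a newline */
	LOG_CONT      = 8,  /* text is a fragment of a continuation line */
	LOG_EMERGENCY = 16, /* HYPOTHETICAL: stored from an emergency section */
};

/* The new bit must not collide with the bits that are already in use. */
static_assert((LOG_EMERGENCY & (LOG_FORCE_CON | LOG_NEWLINE | LOG_CONT)) == 0,
	      "emergency flag overlaps an existing flag");

/* Simplified stand-in for the meta data kept with each ringbuffer record. */
struct record_info {
	uint64_t seq;   /* sequence number */
	uint8_t level;  /* syslog level */
	uint8_t flags;  /* bitmask of enum printk_info_flags */
};

/* Tag the record at store time so that every reader can see it later. */
static void mark_emergency(struct record_info *info, bool in_emergency)
{
	if (in_emergency)
		info->flags |= LOG_EMERGENCY;
}

static bool is_emergency(const struct record_info *info)
{
	return (info->flags & LOG_EMERGENCY) != 0;
}
```

Because the bit travels with the record, a reader could add the "[E]"
tag when formatting the line instead of relying on separate marker
lines.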

> [1] Note that a patchset that allows defining a per-console loglevel
>      is still floating around, see
>      https://lore.kernel.org/r/cover.1730133890.git.chris@chrisdown.name
>
> [2] It might eventually be replaced by a config option which would show
>     all emergency messages on consoles.

Which, when enabled, would simply result in setting LOG_FORCE_CON
whenever the new emergency flag is set.
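
As a minimal sketch of that logic in plain userspace C: LOG_FORCE_CON is
the existing flag, while LOG_EMERGENCY, the emergency_force_con switch
(standing in for the config option) and both helper functions are
hypothetical and only model the behavior described above.

```c
#include <stdbool.h>

enum {
	LOG_FORCE_CON = 1,  /* existing flag: ignore console_loglevel */
	LOG_EMERGENCY = 16, /* HYPOTHETICAL emergency flag, not in the kernel */
};

/* HYPOTHETICAL stand-in for the config option discussed in [2]. */
static bool emergency_force_con = true;

/* At store time: emergency records also get LOG_FORCE_CON if enabled. */
static unsigned int finalize_flags(unsigned int flags)
{
	if (emergency_force_con && (flags & LOG_EMERGENCY))
		flags |= LOG_FORCE_CON;
	return flags;
}

/* Simplified console-side check: forced records bypass the loglevel. */
static bool shown_on_console(int level, unsigned int flags,
			     int console_loglevel)
{
	if (flags & LOG_FORCE_CON)
		return true;
	return level < console_loglevel;
}
```

With the option off, an emergency record is filtered by its loglevel
like any other record, which is what allows slow consoles to skip the
extra details.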

John