Message-ID: <ZxZYKe0t7jWX-_1K@pathway.suse.cz>
Date: Mon, 21 Oct 2024 15:33:29 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Marcos Paulo de Souza <mpdesouza@...e.com>,
Steven Rostedt <rostedt@...dmis.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>, linux-kernel@...r.kernel.org,
linux-serial@...r.kernel.org
Subject: Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
On Fri 2024-10-18 09:20:19, John Ogness wrote:
> On 2024-10-17, Petr Mladek <pmladek@...e.com> wrote:
> > # echo h >/proc/sysrq-trigger
> >
> > produced:
> >
> > [ 53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
> > [ 53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
> > [ 53.669931] 2 locks held by bash/1637:
> > [ 53.669936] #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
> > [ 53.669968] #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
> > [ 53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
> > [ 53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
> > [ 53.670020] Call Trace:
> > [ 53.670026] <TASK>
> > [ 53.670045] dump_stack_lvl+0x6c/0xa0
> > [ 53.670064] __cant_migrate.cold+0x7c/0x89
> > [ 53.670080] printk_loud_console_enter+0x15/0x30
> > [ 53.670088] __handle_sysrq+0x60/0x120
> > [ 53.670104] write_sysrq_trigger+0x6a/0xa0
> > [ 53.670120] proc_reg_write+0x5f/0xb0
> > [ 53.670132] vfs_write+0xf9/0x540
> > [ 53.670147] ? __lock_release.isra.0+0x1a6/0x2c0
> > [ 53.670172] ? do_user_addr_fault+0x38c/0x720
> > [ 53.670197] ksys_write+0x6e/0xf0
> > [ 53.670220] do_syscall_64+0x79/0x190
> > [ 53.670238] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > IMHO, the best solution would be to call migrate_disable()/enable()
> > in printk_loud_console_enter()/exit().
>
> That will not work because migrate_enable() can only be called from
> can_sleep context. Instead, the migrate_disable()/enable() should be at
> the few (one?) call sites where printk_loud_console_enter()/exit() is
> used from task context.
Hmm, if I get it correctly, we could not use migrate_disable() in
__handle_sysrq() because it can also be called in atomic context,
for example:
  + pl010_int()
    + pl010_rx_chars()
      + uart_handle_sysrq_char()
        + handle_sysrq()
          + __handle_sysrq()
I do not see any easy way to distinguish whether it was called in
an atomic context or not.
So, I see three possibilities:
1. Explicitly call preempt_disable() in __handle_sysrq().
It would be just around the single line or the help. But still,
I do not like it much.
2. Avoid the per-CPU variable. Force adding the LOUD_CON/FORCE_CON
flag using a global variable, e.g. printk_force_console.
The problem is that it might also affect messages printed by
other CPUs. And there might be many.
Well, console_loglevel is a global variable. The original code
had a similar problem.
3. Add the LOUD_CON/FORCE_CON flag via a parameter. For example,
by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.
It might work well for __handle_sysrq() which calls the affected
printk() directly.
But it won't work, for example, for kdb_show_stack(). It wants
to show messages printed by nested functions.
I personally prefer the 2nd variant. It fixes the problem and it
should not make things worse.
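
For illustration, the 2nd variant could look roughly like the userspace
sketch below. It is only a sketch under assumptions: the helper names
printk_force_console_enter()/exit() and is_printk_force_console() are
made up here, and C11 atomics stand in for the kernel's atomic_t. The
point is that a global nesting counter does not depend on the current
CPU, so it needs no migrate_disable() or preempt_disable() and can be
used from both task and atomic context.

```c
/* Userspace sketch only, not kernel code. Helper names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Global (not per-CPU) nesting counter; zero-initialized static storage. */
static atomic_int printk_force_console;

/* Start a section whose messages should always reach the console. */
static void printk_force_console_enter(void)
{
	atomic_fetch_add(&printk_force_console, 1);
}

/* End the section; enter/exit calls must be balanced. */
static void printk_force_console_exit(void)
{
	int prev = atomic_fetch_sub(&printk_force_console, 1);

	/* Catch an exit without a matching enter in debug builds. */
	assert(prev > 0);
	(void)prev;
}

/*
 * True while any context is inside an enter/exit section. Because the
 * counter is global, messages printed by other CPUs in the meantime
 * would be forced to the console as well.
 */
static bool is_printk_force_console(void)
{
	return atomic_load(&printk_force_console) > 0;
}
```

A caller such as __handle_sysrq() would wrap the affected printk()
calls in enter()/exit(), and the code storing the message would check
is_printk_force_console() to decide whether to set the flag.
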
Best Regards,
Petr