[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZxDl-VcVAI8DGM40@pathway.suse.cz>
Date: Thu, 17 Oct 2024 12:24:57 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Marcos Paulo de Souza <mpdesouza@...e.com>,
Steven Rostedt <rostedt@...dmis.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>, linux-kernel@...r.kernel.org,
linux-serial@...r.kernel.org
Subject: Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
On Wed 2024-10-16 20:17:48, John Ogness wrote:
> Hi Marcus,
>
> On 2024-10-16, Marcos Paulo de Souza <mpdesouza@...e.com> wrote:
> > Introduce LOUD_CON flag to printk.
>
> Generally speaking, I do not like the name "LOUD_CON". The flag is
> related to records, not consoles. Something like "NO_SUPPRESS" or
> "FORCE_PRINT" might be more appropriate. Note that naming is not my
> strength.
I see. I would prefer to avoid names with double negatives.
So I prefer the "FORCE_PRINT" variant.
That said, I think that "FORCE_CON" might also be good.
Aka, the message forces consoles to show it.
> > The new flag will make it possible to
> > create a context where printk messages will never be suppressed. This
> > new context information will be stored in the already existing
> > printk_context per-CPU variable. This variable was changed from 'int' to
> > 'unsigned int' to avoid issues with automatic casting.
> >
> > This mechanism will be used in the next patch to create a loud_console
> > context on sysrq handling, removing an existing workaround on the
> > loglevel global variable. The workaround existed to make sure that sysrq
> > header messages were sent to all consoles.
>
> IMO the more interesting aspect is that the "loud" flag is stored in the
> ringbuffer so that the message is not suppressed, even if printed later
> (for example because it was deferred). This actually even fixes a bug
> since the current workaround will not perform as expected if the sysrq
> records are deferred (for example due to threaded printing or consoles
> that are registered later).
Yeah. I think that the fix of this race is the main motivation for
the change. We should make it more clear in the commit message.
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index beb808f4c367..b893825fe21d 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
> > unsigned long timeout;
> >
> > if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
> > + || is_printk_console_loud()
> > || suppress_message_printing(level)) {
>
> I do not think "loud" should be a reason to skip the delays. The delays
> are there to slow down printing. I would think that for "loud" messages,
> this is even more important. I suppose this function (as well as
> printk_delay()) would need a new boolean parameter whether it is a
> "loud" message. Then:
>
> || (!loud_con && suppress_message_printing(level))
Yes, this makes more sense.
> > @@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
> > if (dev_info)
> > flags |= LOG_NEWLINE;
> >
> > + if (is_printk_console_loud())
> > + flags |= LOG_LOUD_CON;
> > +
> > if (flags & LOG_CONT) {
> > prb_rec_init_wr(&r, reserve_size);
> > if (prb_reserve_in_last(&e, prb, &r, caller_id, PRINTKRB_RECORD_MAX)) {
>
> I guess LOG_LOUD_CON should also be set in the LOG_CONT case (like
> LOG_NEWLINE does).
Great catch! Yes, it would make sense to set it even in the re-opened record.
> > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> > index 2b35a9d3919d..4618988baeea 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -12,7 +12,30 @@
> >
> > #include "internal.h"
> >
> > -static DEFINE_PER_CPU(int, printk_context);
> > +static DEFINE_PER_CPU(unsigned int, printk_context);
> > +
> > +#define PRINTK_SAFE_CONTEXT_MASK 0x0000ffffU
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK 0xffff0000U
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET 0x00010000U
> > +
> > +void noinstr printk_loud_console_enter(void)
> > +{
> > + cant_migrate();
> > + this_cpu_add(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
> > +}
>
> Have you tested this with lockdep? AFAICT, the write_sysrq_trigger()
> path can migrate since it is only using rcu_read_lock() in
> __handle_sysrq().
Interesting. I have always had connected rcu_read_lock() with
preempt_disable(). But I see that there actually exists preemptive
variant with CONFIG_PREEMPT_RCU. And I guess that preemptive RCU
read lock is a must-to-have on RT.
And indeed:
# echo h >/proc/sysrq-trigger
produced:
[ 53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
[ 53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
[ 53.669931] 2 locks held by bash/1637:
[ 53.669936] #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
[ 53.669968] #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
[ 53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
[ 53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
[ 53.670020] Call Trace:
[ 53.670026] <TASK>
[ 53.670045] dump_stack_lvl+0x6c/0xa0
[ 53.670064] __cant_migrate.cold+0x7c/0x89
[ 53.670080] printk_loud_console_enter+0x15/0x30
[ 53.670088] __handle_sysrq+0x60/0x120
[ 53.670104] write_sysrq_trigger+0x6a/0xa0
[ 53.670120] proc_reg_write+0x5f/0xb0
[ 53.670132] vfs_write+0xf9/0x540
[ 53.670147] ? __lock_release.isra.0+0x1a6/0x2c0
[ 53.670172] ? do_user_addr_fault+0x38c/0x720
[ 53.670197] ksys_write+0x6e/0xf0
[ 53.670220] do_syscall_64+0x79/0x190
[ 53.670238] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 53.670252] RIP: 0033:0x7fa4247d9b13
[ 53.670269] Code: 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89
[ 53.670275] RSP: 002b:00007ffe62690f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 53.670282] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa4247d9b13
[ 53.670288] RDX: 0000000000000002 RSI: 00005557ac516ec0 RDI: 0000000000000001
[ 53.670291] RBP: 00005557ac516ec0 R08: 000000000000000a R09: 0000000000000000
[ 53.670295] R10: 00007fa4246da468 R11: 0000000000000246 R12: 00007fa424abd500
[ 53.670298] R13: 0000000000000002 R14: 00007fa424ac2c00 R15: 0000000000000002
[ 53.670350] </TASK>
[ 53.670358] sysrq: HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z) replay-kernel-logs(R)
IMHO, the best solution would be to call migrate_disable()/enable()
in printk_loud_console_enter()/exit().
We should not disable preemption because the output could
be long.
Best Regards,
Petr
Powered by blists - more mailing lists