lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZxDl-VcVAI8DGM40@pathway.suse.cz>
Date: Thu, 17 Oct 2024 12:24:57 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Marcos Paulo de Souza <mpdesouza@...e.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>, linux-kernel@...r.kernel.org,
	linux-serial@...r.kernel.org
Subject: Re: [PATCH 1/2] printk: Introduce LOUD_CON flag

On Wed 2024-10-16 20:17:48, John Ogness wrote:
> Hi Marcus,
> 
> On 2024-10-16, Marcos Paulo de Souza <mpdesouza@...e.com> wrote:
> > Introduce LOUD_CON flag to printk.
> 
> Generally speaking, I do not like the name "LOUD_CON". The flag is
> related to records, not consoles. Something like "NO_SUPPRESS" or
> "FORCE_PRINT" might be more appropriate. Note that naming is not my
> strength.

I see. I would prefer to avoid names with double negatives.
So I prefer the "FORCE_PRINT" variant.

That said, I think that "FORCE_CON" might also be good.
Aka, the message forces consoles to show it.

> > The new flag will make it possible to
> > create a context where printk messages will never be suppressed. This
> > new context information will be stored in the already existing
> > printk_context per-CPU variable. This variable was changed from 'int' to
> > 'unsigned int' to avoid issues with automatic casting.
> >
> > This mechanism will be used in the next patch to create a loud_console
> > context on sysrq handling, removing an existing workaround on the
> > loglevel global variable. The workaround existed to make sure that sysrq
> > header messages were sent to all consoles.
> 
> IMO the more interesting aspect is that the "loud" flag is stored in the
> ringbuffer so that the message is not suppressed, even if printed later
> (for example because it was deferred). This actually even fixes a bug
> since the current workaround will not perform as expected if the sysrq
> records are deferred (for example due to threaded printing or consoles
> that are registered later).

Yeah. I think that the fix of this race is the main motivation for
the change. We should make it more clear in the commit message.

> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index beb808f4c367..b893825fe21d 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
> >  	unsigned long timeout;
> >  
> >  	if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
> > +		|| is_printk_console_loud()
> >  		|| suppress_message_printing(level)) {
> 
> I do not think "loud" should be a reason to skip the delays. The delays
> are there to slow down printing. I would think that for "loud" messages,
> this is even more important. I suppose this function (as well as
> printk_delay()) would need a new boolean parameter whether it is a
> "loud" message. Then:
> 
> 	|| (!loud_con && suppress_message_printing(level))

Yes, this makes more sense.

> > @@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
> >  	if (dev_info)
> >  		flags |= LOG_NEWLINE;
> >  
> > +	if (is_printk_console_loud())
> > +		flags |= LOG_LOUD_CON;
> > +
> >  	if (flags & LOG_CONT) {
> >  		prb_rec_init_wr(&r, reserve_size);
> >  		if (prb_reserve_in_last(&e, prb, &r, caller_id, PRINTKRB_RECORD_MAX)) {
> 
> I guess LOG_LOUD_CON should also be set in the LOG_CONT case (like
> LOG_NEWLINE does).

Great catch! Yes, it would make sense to set it even in the re-opened record.

> > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> > index 2b35a9d3919d..4618988baeea 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -12,7 +12,30 @@
> >  
> >  #include "internal.h"
> >  
> > -static DEFINE_PER_CPU(int, printk_context);
> > +static DEFINE_PER_CPU(unsigned int, printk_context);
> > +
> > +#define PRINTK_SAFE_CONTEXT_MASK		0x0000ffffU
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK	0xffff0000U
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET	0x00010000U
> > +
> > +void noinstr printk_loud_console_enter(void)
> > +{
> > +	cant_migrate();
> > +	this_cpu_add(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
> > +}
> 
> Have you tested this with lockdep? AFAICT, the write_sysrq_trigger()
> path can migrate since it is only using rcu_read_lock() in
> __handle_sysrq().

Interesting. I have always had connected rcu_read_lock() with
preempt_disable(). But I see that there actually exists preemptive
variant with CONFIG_PREEMPT_RCU. And I guess that preemptive RCU
read lock is a must-to-have on RT.

And indeed:

# echo h >/proc/sysrq-trigger

produced:

[   53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
[   53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
[   53.669931] 2 locks held by bash/1637:
[   53.669936]  #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
[   53.669968]  #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
[   53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
[   53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
[   53.670020] Call Trace:
[   53.670026]  <TASK>
[   53.670045]  dump_stack_lvl+0x6c/0xa0
[   53.670064]  __cant_migrate.cold+0x7c/0x89
[   53.670080]  printk_loud_console_enter+0x15/0x30
[   53.670088]  __handle_sysrq+0x60/0x120
[   53.670104]  write_sysrq_trigger+0x6a/0xa0
[   53.670120]  proc_reg_write+0x5f/0xb0
[   53.670132]  vfs_write+0xf9/0x540
[   53.670147]  ? __lock_release.isra.0+0x1a6/0x2c0
[   53.670172]  ? do_user_addr_fault+0x38c/0x720
[   53.670197]  ksys_write+0x6e/0xf0
[   53.670220]  do_syscall_64+0x79/0x190
[   53.670238]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   53.670252] RIP: 0033:0x7fa4247d9b13
[   53.670269] Code: 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89
[   53.670275] RSP: 002b:00007ffe62690f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   53.670282] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa4247d9b13
[   53.670288] RDX: 0000000000000002 RSI: 00005557ac516ec0 RDI: 0000000000000001
[   53.670291] RBP: 00005557ac516ec0 R08: 000000000000000a R09: 0000000000000000
[   53.670295] R10: 00007fa4246da468 R11: 0000000000000246 R12: 00007fa424abd500
[   53.670298] R13: 0000000000000002 R14: 00007fa424ac2c00 R15: 0000000000000002
[   53.670350]  </TASK>
[   53.670358] sysrq: HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z) replay-kernel-logs(R) 


IMHO, the best solution would be to call migrate_disable()/enable()
in printk_loud_console_enter()/exit().

We should not disable preemption because the output could
be long.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ