Message-ID: <ZxZYKe0t7jWX-_1K@pathway.suse.cz>
Date: Mon, 21 Oct 2024 15:33:29 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Marcos Paulo de Souza <mpdesouza@...e.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>, linux-kernel@...r.kernel.org,
	linux-serial@...r.kernel.org
Subject: Re: [PATCH 1/2] printk: Introduce LOUD_CON flag

On Fri 2024-10-18 09:20:19, John Ogness wrote:
> On 2024-10-17, Petr Mladek <pmladek@...e.com> wrote:
> > # echo h >/proc/sysrq-trigger
> >
> > produced:
> >
> > [   53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
> > [   53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
> > [   53.669931] 2 locks held by bash/1637:
> > [   53.669936]  #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
> > [   53.669968]  #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
> > [   53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
> > [   53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
> > [   53.670020] Call Trace:
> > [   53.670026]  <TASK>
> > [   53.670045]  dump_stack_lvl+0x6c/0xa0
> > [   53.670064]  __cant_migrate.cold+0x7c/0x89
> > [   53.670080]  printk_loud_console_enter+0x15/0x30
> > [   53.670088]  __handle_sysrq+0x60/0x120
> > [   53.670104]  write_sysrq_trigger+0x6a/0xa0
> > [   53.670120]  proc_reg_write+0x5f/0xb0
> > [   53.670132]  vfs_write+0xf9/0x540
> > [   53.670147]  ? __lock_release.isra.0+0x1a6/0x2c0
> > [   53.670172]  ? do_user_addr_fault+0x38c/0x720
> > [   53.670197]  ksys_write+0x6e/0xf0
> > [   53.670220]  do_syscall_64+0x79/0x190
> > [   53.670238]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > IMHO, the best solution would be to call migrate_disable()/enable()
> > in printk_loud_console_enter()/exit().
> 
> That will not work because migrate_enable() can only be called from
> can_sleep context. Instead, the migrate_disable()/enable() should be at
> the few (one?) call sites where printk_loud_console_enter()/exit() is
> used from task context.

Hmm, if I get it correctly, we could not use migrate_disable() in
__handle_sysrq() because it can be called also in atomic context,
for example:

  + pl010_int()
    + pl010_rx_chars()
      + uart_handle_sysrq_char()
	+ handle_sysrq()
	  + __handle_sysrq()

I do not see any easy way to distinguish whether it was called in
an atomic context or not.

So, I see three possibilities:

  1. Explicitly call preempt_disable() in __handle_sysrq().

     It would be just around the single line or the help. But still,
     I do not like it much.


  2. Avoid the per-CPU variable. Force adding the LOUD_CON/FORCE_CON
     flag using a global variable, e.g. printk_force_console.

     The problem is that it might affect also messages printed by
     other CPUs. And there might be many.

     Well, console_loglevel is a global variable. The original code
     had a similar problem.


  3. Add the LOUD_CON/FORCE_CON flag via a parameter. For example,
     by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.

     It might work well for __handle_sysrq() which calls the affected
     printk() directly.

     But it won't work, for example, for kdb_show_stack(). It wants
     to show messages printed by nested functions.


I personally prefer the 2nd variant. It fixes the problem and it
should not make things worse.
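
To make the 2nd variant more concrete, here is a rough userspace model
of what I have in mind. It is only a sketch: the helper names are
hypothetical, it uses C11 atomics instead of the kernel atomic_t API,
and the real code would check the counter when the message is stored
to decide whether to set the flag. A counter rather than a plain
boolean is an assumption that allows nested or concurrent callers.

```c
/* Userspace model only; not the actual kernel patch. */
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global nesting counter replacing the per-CPU variable. */
static atomic_int printk_force_console;

/* Safe in any context: no migration or preemption assumptions. */
static void printk_force_console_enter(void)
{
	atomic_fetch_add(&printk_force_console, 1);
}

static void printk_force_console_exit(void)
{
	atomic_fetch_sub(&printk_force_console, 1);
}

/*
 * Would be checked when a message is stored. It is true for every CPU
 * while any caller is inside the enter/exit section, which is the
 * known drawback of this variant.
 */
static bool is_printk_force_console(void)
{
	return atomic_load(&printk_force_console) > 0;
}
```

Because the state is global, the helpers need neither migrate_disable()
nor preempt_disable(), so they could be called from both task and
atomic context.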

Best Regards,
Petr
