linux-kernel - Re: [BUG] workqueues and printk not playing nice since next-20240130

Date: Fri, 2 Feb 2024 09:35:59 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: John Ogness <john.ogness@...utronix.de>
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
	Petr Mladek <pmladek@...e.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org
Subject: Re: [BUG] workqueues and printk not playing nice since next-20240130

On Fri, Feb 02, 2024 at 06:08:25PM +0106, John Ogness wrote:
> On 2024-02-02, "Paul E. McKenney" <paulmck@...nel.org> wrote:
> >> The printk ringbuffer contents would certainly be interesting.
> >> 
> >> If you build the GDB scripts (CONFIG_GDB_SCRIPTS) then you will have:
> >> 
> >> (gdb) lx-dmesg
> >
> > This says no such command even though I do have CONFIG_GDB_SCRIPTS=y
> > in my .config.
> 
> You actually need to build them as well. The target is "scripts_gdb"
> 
> And you probably need to add:
> 
> add-auto-load-safe-path /path/to/your/kernel/build/directory
> 
> to your .gdbinit
> 
> (This is documented in Documentation/dev-tools/gdb-kernel-debugging.rst)

Thank you!  Next time I am in a similar situation, I will pay more
attention to the documentation.
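
For reference, the .gdbinit step quoted above can be scripted. The following
is only a sketch: the helper name ensure_safe_path is hypothetical, and the
build directory is whatever path applies locally. It appends the
add-auto-load-safe-path line once and leaves the file alone if the line is
already there.

```python
from pathlib import Path


def ensure_safe_path(gdbinit, build_dir):
    """Append an add-auto-load-safe-path line to gdbinit if it is missing.

    Hypothetical helper, not part of the kernel tree.
    Returns True if the file was changed, False if the line already existed.
    """
    path = Path(gdbinit)
    line = f"add-auto-load-safe-path {build_dir}"
    text = path.read_text() if path.exists() else ""

    # Do nothing if an identical line is already present.
    if line in text.splitlines():
        return False

    # Make sure the existing content ends with a newline before appending.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + line + "\n")
    return True
```

A call would look like ensure_safe_path(Path.home() / ".gdbinit",
"/path/to/your/kernel/build/directory"). The scripts themselves still have to
be built first with the "scripts_gdb" make target mentioned in the quoted
text.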

> >> As an alternative, you could copy the contents of
> >> Documentation/admin-guide/kdump/gdbmacros.txt into your .gdbinit and
> >> then you will have:
> >> 
> >> (gdb) dmesg
> >
> > This one hangs.
> 
> :-/ I will look into this.
> 
> > On the other hand, next-20240202 doesn't show the problem.  No idea
> > what might have changed.  :-/
> 
> Did you check the backtrace on all the "threads"? I would expect one of
> them has tty in it and is probably deadlocked. There are known problems
> that if a WARN or lockdep triggers while holding the port lock, that CPU
> will deadlock itself. That has the effect that no output is generated,
> but all the other CPUs will run fine. And even printk() calls will
> happily store into the ringbuffer because they use trylock for printing
> and the deadlocked CPU will be holding the lock.

Again, thank you, and another thing for me to try should this start
happening again.
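
The trylock behavior described in the quoted text can be illustrated with a
small userspace sketch. This is an analogy only, not kernel code: the class
SketchLog and its fields are made up for illustration, and it does not model
which kernel lock is involved. It shows why messages keep landing in the
buffer while nothing is printed once the lock owner never releases the lock.

```python
import threading


class SketchLog:
    """Userspace analogy: always store the message, print only via trylock."""

    def __init__(self):
        self.ringbuffer = []                  # stand-in for the printk ringbuffer
        self.printed = []                     # what actually reached the "console"
        self.console_lock = threading.Lock()  # stand-in for the printing lock

    def printk(self, msg):
        # Storing never depends on the lock, so it always succeeds.
        self.ringbuffer.append(msg)

        # Non-blocking attempt: if someone else holds the lock, give up
        # on printing instead of waiting for the owner.
        if not self.console_lock.acquire(blocking=False):
            return False
        try:
            # Flush everything that has not been printed yet.
            self.printed.extend(self.ringbuffer[len(self.printed):])
        finally:
            self.console_lock.release()
        return True


log = SketchLog()
log.printk("before deadlock")   # lock is free, so this message is printed

log.console_lock.acquire()      # simulate the stuck owner that never releases
log.printk("after deadlock 1")  # stored, but not printed
log.printk("after deadlock 2")  # stored, but not printed
```

After this runs, log.ringbuffer holds all three messages while log.printed
holds only the first one. That is the situation in which dumping the
ringbuffer from a debugger shows output that never reached the console.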

							Thanx, Paul