Message-ID: <ZqEY6ZIB7XThgKW3@pathway.suse.cz>
Date: Wed, 24 Jul 2024 17:08:25 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Rik van Riel <riel@...riel.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Omar Sandoval <osandov@...a.com>, linux-kernel@...r.kernel.org,
	Steven Rostedt <rostedt@...dmis.org>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	kernel-team <kernel-team@...a.com>
Subject: Re: [RFC PATCH] nmi,printk: fix ABBA deadlock between nmi_backtrace
 and dump_stack_lvl

On Wed 2024-07-24 16:51:46, John Ogness wrote:
> On 2024-07-24, Petr Mladek <pmladek@...e.com> wrote:
> > My guess is that it looked like:
> >
> > CPU A				CPU B
> >
> > 				printk()
> > 				  console_try_lock_spinning()
> > 				  console_unlock()
> > 				    console_emit_next_record()
> > 				      console_lock_spinning_enable();
> > 					con->write()
> > 					  spin_lock(port->lock);
> >
> > printk_cpu_sync_get()
> >   printk()
> >     console_try_lock_spinning()
> >       # spinning and waiting for CPU B
> >
> > 				NMI:
> >
> > 				  printk_cpu_sync_get()
> > 				    # waiting for CPU A
> >
> > => DEADLOCK
> >
> >
> > The deadlock is caused under/by printk_cpu_sync_get() but only because
> > console_try_lock_spinning() is blocked. It is not a true "try_lock"
> > operation which should never get blocked.
> >
> > => The above patch should solve the problem as well. It will make
> >    console_try_lock_spinning() fail immediately on CPU A.
> >
> > Note that port->lock can't cause any deadlock in this scenario.
> > console_try_lock_spinning() will always fail on CPU A until
> > the NMI gets handled on CPU B.
> >
> > In other words, printk_cpu_sync_get() will behave as a tail lock
> > on CPU A because of the failing trylock.
> 
> But only in _this_ scenario. The port lock could be taken by CPU B for
> non-console-printing reasons. Then you still have deadlock, due to
> spinning on the port lock.

I see. I agree that deferring printk on that CPU [0] is the right solution.

> [0] https://lore.kernel.org/lkml/87plrcqyii.fsf@jogness.linutronix.de
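
For illustration only, the scenario above can be modeled in user space as a
small wait-for chain. This is a rough sketch, not kernel code: the context
labels (CPU_A, CPU_B_TASK, CPU_B_NMI) and the helpers blocked_forever() and
build_scenario() are hypothetical names made up for this example. It assumes
that each context waits on at most one other context, and that deferring
printk on CPU A means CPU A no longer spins on anything owned by CPU B.

```c
#include <assert.h>
#include <stdbool.h>

/* Execution contexts from the scenario; the names are illustrative labels. */
enum ctx { CPU_A, CPU_B_TASK, CPU_B_NMI, NR_CTX, NOBODY = -1 };

/*
 * waits_for[c] is the context that c is blocked on, or NOBODY.
 * Returns true if following the chain from 'start' leads back to 'start',
 * i.e. 'start' is part of a wait cycle and can never make progress.
 */
static bool blocked_forever(const int waits_for[NR_CTX], int start)
{
	int cur, steps;

	assert(start >= 0 && start < NR_CTX);

	cur = waits_for[start];
	for (steps = 0; steps < NR_CTX && cur != NOBODY; steps++) {
		if (cur == start)
			return true;
		cur = waits_for[cur];
	}
	return false;
}

/* Fill in the wait-for relations of the scenario (an assumed model). */
static void build_scenario(int waits_for[NR_CTX], bool defer_on_cpu_a)
{
	/* The NMI on CPU B spins in printk_cpu_sync_get(), owned by CPU A. */
	waits_for[CPU_B_NMI] = CPU_A;

	/* The interrupted context on CPU B cannot run until the NMI returns. */
	waits_for[CPU_B_TASK] = CPU_B_NMI;

	/*
	 * Without deferral, CPU A spins on something CPU B holds (console
	 * ownership or port->lock). With deferral, it does not wait at all.
	 */
	waits_for[CPU_A] = defer_on_cpu_a ? NOBODY : CPU_B_TASK;
}
```

Calling build_scenario() with defer_on_cpu_a set to false yields a cycle
through all three contexts. With it set to true, the chain ends at CPU A, so
the NMI on CPU B only has to wait until CPU A releases printk_cpu_sync.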

Best Regards,
Petr
