Date:	Thu, 28 May 2015 14:00:27 +0200
From:	Petr Mladek <pmladek@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dave Anderson <anderson@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Kay Sievers <kay@...y.org>, Jiri Kosina <jkosina@...e.cz>,
	Michal Hocko <mhocko@...e.cz>, Jan Kara <jack@...e.cz>,
	linux-kernel@...r.kernel.org, Wang Long <long.wanglong@...wei.com>,
	peifeiyue@...wei.com, dzickus@...hat.com, morgan.wang@...wei.com,
	sasha.levin@...cle.com
Subject: Re: [PATCH 01/10] printk: Avoid deadlock in NMI context

On Wed 2015-05-27 16:13:46, Andrew Morton wrote:
> On Mon, 25 May 2015 14:46:24 +0200 Petr Mladek <pmladek@...e.cz> wrote:
> 
> > printk() cannot be used in NMI context safely because it uses an internal
> > lock and thus could cause a deadlock. This is fine because NMI context
> > is very special. The handlers should be short and efficient, and they
> > should not use printk().
> > 
> > But developers tend to print warnings even from NMI code. Such calls are
> > hard to debug when they lock up the machine and nothing appears in the logs.
> > 
> > This patch prevents the deadlock on logbuf_lock by using trylock
> > rather than spin_lock. If the lock cannot be taken, the message is
> > dropped and a warning is printed later on.
> > 
> > We also must not try to take the console from NMI context. That needs
> > even more locks, so the chance of a hang is even higher.
> > 
> > Unfortunately, we cannot print more details about the lost message.
> > We cannot allocate a buffer in NMI. Therefore we would need some
> > lockless mechanism to share a buffer between NMI and normal context.
> > But this would make printk() code much more complicated and
> > it is not worth it. There has already been an attempt to do so
> > and it has been rejected, see https://lkml.org/lkml/2014/6/10/388
> > This is also the reason why we use the atomic counter.
> 
> hm, I expect it wouldn't be too messy to shove the text into a static
> per-cpu buffer.  So we at least get a few hundred bytes of stuff.

The problem is that we would need to read the static buffer in the normal
context without a lock. The result might be a garbled message. Or I
could add some lock-less hackery to keep the buffer consistent, but this
would make the code more complex.
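
To make the trade-off concrete, here is a minimal user-space sketch of one
possible form of such lock-less hackery: a buffer guarded by a sequence
counter. It is not part of the patch. All names (nmi_buf_sketch,
nmi_buf_write, nmi_buf_read, NMI_BUF_SIZE) are hypothetical, a single static
buffer stands in for a per-CPU one, and default C11 atomics stand in for the
barriers that real kernel code would need.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define NMI_BUF_SIZE 128	/* hypothetical size, "a few hundred bytes" */

/* Hypothetical stand-in for a static per-CPU buffer. */
struct nmi_buf_sketch {
	atomic_uint seq;		/* odd while a write is in progress */
	char text[NMI_BUF_SIZE];
};

static struct nmi_buf_sketch nmi_buf;

/* Writer ("NMI" side): never blocks, overwrites the previous message. */
static void nmi_buf_write(struct nmi_buf_sketch *b, const char *msg)
{
	atomic_fetch_add(&b->seq, 1);		/* seq becomes odd */
	strncpy(b->text, msg, NMI_BUF_SIZE - 1);
	b->text[NMI_BUF_SIZE - 1] = '\0';
	atomic_fetch_add(&b->seq, 1);		/* seq becomes even again */
}

/*
 * Reader (normal context): copies the buffer without taking a lock.
 * Returns false when a writer was active or ran during the copy,
 * in which case the snapshot in 'out' must not be trusted.
 */
static bool nmi_buf_read(struct nmi_buf_sketch *b, char out[NMI_BUF_SIZE])
{
	unsigned int start = atomic_load(&b->seq);

	if (start & 1)
		return false;
	memcpy(out, b->text, NMI_BUF_SIZE);
	return atomic_load(&b->seq) == start;
}
```

Even in this toy form the reader needs a retry or give-up path, and the
buffer only ever holds the most recent message, so earlier ones are still
lost.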

In either case, we will not be able to preserve all messages, so I am
not sure that a more complex solution is worth doing.

Note that we are talking about a corner case. printk() should not be
used in NMI in the first place. If it is used, we still do our best
to get the message out. We try even harder to get Oops messages out.
If the message is lost, it might mean that there is a flood of
printk()s, and the message might have been lost anyway.
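
For reference, the approach described in the patch (trylock in NMI, an
atomic counter of dropped messages, and a warning printed later) can be
sketched in user space roughly as follows. This is an illustration under
stated assumptions, not the patch itself: sketch_trylock, sketch_unlock,
sketch_store, sketch_emit, nmi_lost_count and the wording of the warning
are all hypothetical, and an atomic_flag stands in for logbuf_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for logbuf_lock and the kernel log buffer. */
static atomic_flag logbuf_lock_sketch = ATOMIC_FLAG_INIT;
static atomic_int nmi_lost_count;	/* messages dropped in "NMI" */
static char log_sketch[256];
static size_t log_len;

static bool sketch_trylock(void)
{
	return !atomic_flag_test_and_set(&logbuf_lock_sketch);
}

static void sketch_unlock(void)
{
	atomic_flag_clear(&logbuf_lock_sketch);
}

/* Append text to the toy log, truncating when it is full. */
static void sketch_store(const char *text)
{
	size_t room = sizeof(log_sketch) - 1 - log_len;
	size_t n = strlen(text);

	if (n > room)
		n = room;
	memcpy(log_sketch + log_len, text, n);
	log_len += n;
	log_sketch[log_len] = '\0';
}

/* Returns true if the message was stored, false if it was dropped. */
static bool sketch_emit(const char *text, bool in_nmi)
{
	char warn[64];
	int lost;

	if (in_nmi) {
		/* Never spin in NMI: count the loss and bail out. */
		if (!sketch_trylock()) {
			atomic_fetch_add(&nmi_lost_count, 1);
			return false;
		}
	} else {
		while (!sketch_trylock())
			;	/* normal context may spin */
	}

	/* Report losses that happened while the lock was unavailable. */
	lost = atomic_exchange(&nmi_lost_count, 0);
	if (lost > 0) {
		snprintf(warn, sizeof(warn),
			 "** %d message(s) lost in NMI **\n", lost);
		sketch_store(warn);
	}
	sketch_store(text);
	sketch_unlock();
	return true;
}
```

Only the number of dropped messages survives here, not their text, which
is the limitation discussed above.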

> 
> > +		/* emit KERN_CRIT message */
> > +		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
> > +					 NULL, 0, text, text_len);
> 
> s/2/LOGLEVEL_CRIT/

Good point.

Best Regards,
Petr