lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 20 May 2020 10:35:57 +0100
From:   Daniel Thompson <daniel.thompson@...aro.org>
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:     Petr Mladek <pmladek@...e.com>, Sumit Garg <sumit.garg@...aro.org>,
        Jason Wessel <jason.wessel@...driver.com>,
        Douglas Anderson <dianders@...omium.org>,
        kgdb-bugreport@...ts.sourceforge.net,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        John Ogness <john.ogness@...utronix.de>
Subject: Re: [PATCH] printk/kdb: Redirect printk messages into kdb in any
 context

On Wed, May 20, 2020 at 01:21:02PM +0900, Sergey Senozhatsky wrote:
> On (20/05/18 11:21), Petr Mladek wrote:
> [..]
> > > > Is this guaranteed that we never execute this path from NMI?
> > 
> > Good question!
> > 
> > > Absolutely not.
> > > 
> > > The execution context for kdb is pretty much unique... we are running a
> > > debug mode with all CPUs parked in a holding loop with interrupts
> > > disabled. One CPU is at an unknown exception state and the others are
> > > either handling an IRQ or NMI depending on architecture[1].
> > 
> > This is similar to the situation in panic() when other CPUs are
> > stopped. It is more safe when the CPUs are stopped using IRQ.
> > There is higher danger of a deadlock when NMI is used.
> > 
> > bust_spinlock() is used in panic() to increase the chance to go over
> > the deadlock and actually see the messages. It is not enough when
> > more locks are used by the console (VT/TTY is good example). And
> > it is not guaranteed that the console will still work after
> > the hack is disabled by bust_spinlocks(0).
> 
> Good point. It's not guaranteed to help, but bust_spinlocks() does
> help in general, many serial drivers do check oops_in_progress and
> use a deadlock safe approach when locking port lock. I don't see
> bust_spinlocks() being used in kdb, so it probably better start
> doing so (along with general for_each_console() loop improvements,
> like checking if console is enabled/available/etc).

Agree.

I am also interested in whether we can figure out a way to match
consoles and polling drivers. It is better to use the polling
driver rather than the console since the polling drivers "know"
they will execute from all sorts of crazy places. For the most common
use cases this would also result in no console handler ever being
called.

BTW for those asking how this issue an submarine for so long I think
the main factor is that not all architectures implement NMI.

There are exceptions but kdb is typically only useful when either:

1. We have a real (e.g. not USB) serial port, or
2. We have a real (e.g. not USB) keyboard

On x86, where the SMP peers are rounded up using using an NMI, the
above grows increasingly hard to find (although they are certainly
still here). Combined with this very few commonly embedded
architectures round up using an NMI so these problems cannot occur.

This has allowed kdb to fall rather far behind the rest of the kernel
w.r.t. NMI robustness whilst not being entirely useless.

Sumit's recent work to exploit NMIs on arm64 is bringing some of our
debt to the surface... happily I think that will also help us to tackle
it!


> [..]
> > > > If so, can this please be added to the commit message? A more
> > > > detailed commit message will help a lot.
> > 
> > What about?
> > 
> > "KDB has to get messages on consoles even when the system is stopped.
> > It uses kdb_printf() internally and calls console drivers on its own.
> > 
> > It uses a hack to reuse an existing code. It sets "kdb_trap_printk"
> > global variable to redirect even the normal printk() into the
> > kdb_printf() variant.
> > 
> > The variable "kdb_trap_printk" is checked in printk_default() and
> > it is ignored when printk is redirected to printk_safe in NMI context.
> > Solve this by moving the check into printk_func().

Maybe a "Currently" thrown in we switch from general background
information to describing the problem the patch is about to fix helps us
read it:

  Currently the variable "kdb_trap_printk" is checked

But other than that LGTM.


Daniel.

> > 
> > It is obvious that it is not fully safe. But it does not make things
> > worse. The console drivers are already called in this context by
> > kdb_printf() direct calls."
> 
> This looks more informative indeed. Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ