linux-kernel - Re: NMI watchdog dump does not print on hard lockup

Message-ID: <20181023064904.GB504@jagdpanzerIV>
Date:   Tue, 23 Oct 2018 15:49:04 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Petr Mladek <pmladek@...e.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: NMI watchdog dump does not print on hard lockup

On (10/16/17 10:15), Steven Rostedt wrote:
> On Mon, 16 Oct 2017 22:13:05 +0900
> Sergey Senozhatsky <sergey.senozhatsky@...il.com> wrote:
> 
> > just "brainstorming" it... with some silly ideas.
> > 
> > pushing the data from NMI panic might look like we are replacing one
> > deadlock scenario with another deadlock scenario. some of the console
> > drivers are soooo complex internally. so I have been thinking about...
> > may be we can extend struct console and add ->write_on_panic() and that
> > handler must be as lockless as possible; so lockless that calling it
> > from anything that is not panic() is a severe bug.
> 
> This may not be a bad idea. And make it so it can't be called unless we
> are in panic mode (or at least "oops in progress").
> 
> If oops_in_progress is set, and the console has a "write_on_panic"
> handler, then just call that.

Good news, Steven.

It turns out that some serial consoles already have this
write_on_panic() mechanism built in. Such consoles have the following
in their usual ->write() callbacks (which we call from printk()):

static void serial_console_write(struct console *co, const char *s,
                                 unsigned count)
{
...
        if (port->sysrq)
                locked = 0;
        else if (oops_in_progress)
                locked = spin_trylock_irqsave(&port->lock, flags);
        else
                spin_lock_irqsave(&port->lock, flags);
...

        uart_console_write(port, s, count, serial_console_putchar);
...
        if (locked)
                spin_unlock_irqrestore(&port->lock, flags);
}

Notice the special handling of port->sysrq and oops_in_progress cases.
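
To make the pattern above easier to poke at outside the kernel, here is a
minimal userspace sketch of the same three-way locking decision. It is only
an analogy under stated assumptions: a pthread mutex stands in for
port->lock, and struct demo_port, demo_oops_in_progress and
demo_console_write() are hypothetical names made up for illustration, not
kernel API. Interrupt state handling (the irqsave/irqrestore part) has no
userspace counterpart and is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a uart port; not a kernel structure. */
struct demo_port {
        pthread_mutex_t lock;   /* plays the role of port->lock */
        bool sysrq;             /* plays the role of port->sysrq */
        char out[256];          /* bytes that "reached the wire" */
        size_t len;
};

/* Hypothetical stand-in for the kernel's oops_in_progress flag. */
static bool demo_oops_in_progress;

/*
 * Write count bytes of s to the port. Returns true if the lock was
 * held during the write, false if the write went out without it.
 */
static bool demo_console_write(struct demo_port *port, const char *s,
                               size_t count)
{
        bool locked = true;

        assert(port != NULL && s != NULL);

        if (port->sysrq)
                locked = false;         /* never touch the lock */
        else if (demo_oops_in_progress)
                /* try once, never block: 0 means we got the lock */
                locked = (pthread_mutex_trylock(&port->lock) == 0);
        else
                pthread_mutex_lock(&port->lock);  /* normal path may block */

        /* The output happens whether or not the lock was obtained. */
        for (size_t i = 0; i < count && port->len < sizeof(port->out) - 1; i++)
                port->out[port->len++] = s[i];
        port->out[port->len] = '\0';

        if (locked)
                pthread_mutex_unlock(&port->lock);
        return locked;
}
```

The point of the sketch is the middle branch: when the oops flag is set and
somebody else already holds the lock, the write still goes through instead
of waiting for a lock holder that may never run again.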

So, basically, we already have "lockless on panic" serial consoles.
The problem is that panic() does not always seem to let lockless
consoles be lockless. I'm trying to address this in [1].

[1] lkml.kernel.org/r/20181016050428.17966-2-sergey.senozhatsky@...il.com

	-ss