linux-kernel - Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test

Message-ID: <87shlhv3uv.fsf@xmission.com>
Date:   Sun, 09 Apr 2017 13:21:44 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>, Pavel Machek <pavel@....cz>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Jan Kara <jack@...e.cz>, Ye Xiaolong <xiaolong.ye@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Petr Mladek <pmladek@...e.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jiri Slaby <jslaby@...e.com>, Len Brown <len.brown@...el.com>,
        linux-kernel@...r.kernel.org, lkp@...org
Subject: Re: [printk]  fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage

Sergey Senozhatsky <sergey.senozhatsky@...il.com> writes:

> On (04/07/17 17:23), Peter Zijlstra wrote:
> [..]
>> > we are looking at different typical setups :) serial console being 45
>> > seconds behind logbuf does not surprise me anymore.
>> 
>> That does sound like you're doing something wrong and should look at
>> reducing printk() more than anything else.
>
> yeah, 45sec is an extreme case that simply doesn't surprise me anymore ;)
> that's not a normal/usual delay, of course, we are not this mad. on average
> it's much better and may be not so far 2 seconds after all. a massive OOM
> report, of course, appends logbuf messages at a much higher rate than UART
> serial console can swallow, so the delay is getting larger, expectedly.
> and, no, I don't add any printk-s, I'm looking at the lockup reports

Are you running your serial consoles at 9600 baud?

I would think the first thing to do would be to up your serial console
baud rate to 115200 or at least 38400.
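
As a rough sketch of the arithmetic behind that suggestion: assuming the
usual 8N1 framing (10 bits on the wire per character) and a console that
does nothing but push bytes out the UART, the backlog that takes 45
seconds to drain at 9600 baud drains in a few seconds at 115200. The
function names below are only illustrative.

```python
def bytes_per_second(baud, bits_per_char=10):
    """Throughput of a serial line; 10 bits/char assumes 8N1 framing
    (1 start bit + 8 data bits + 1 stop bit)."""
    return baud / bits_per_char


def drain_seconds(backlog_bytes, baud):
    """Time for the console to emit a backlog of the given size."""
    return backlog_bytes / bytes_per_second(baud)


# Size of a backlog that leaves a 9600 baud console 45 seconds behind.
backlog = 45 * bytes_per_second(9600)

for baud in (9600, 38400, 115200):
    print(f"{baud:>6} baud: {drain_seconds(backlog, baud):6.2f} s behind")
```

This ignores flow control and any per-character overhead in the driver,
so real numbers will be somewhat worse at every rate.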

Similarly, for anything the kernel is certain to survive, I would set
the loglevel so that it is logged somewhere by syslog rather than
printed on the console by printk.
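
To illustrate the loglevel point, here is a minimal sketch that reads the
four values in /proc/sys/kernel/printk and checks whether a message of a
given level would be printed on the console. A message goes to the
console only when its level is numerically lower than console_loglevel;
everything else stays in the log buffer for syslog to pick up. The helper
names are made up for this example.

```python
def parse_printk_sysctl(text):
    """Parse the contents of /proc/sys/kernel/printk, which holds four
    integers: console_loglevel, default_message_loglevel,
    minimum_console_loglevel, default_console_loglevel."""
    names = ("console_loglevel", "default_message_loglevel",
             "minimum_console_loglevel", "default_console_loglevel")
    return dict(zip(names, (int(field) for field in text.split())))


def reaches_console(msg_level, console_loglevel):
    """True if a message at msg_level (0 = KERN_EMERG ... 7 = KERN_DEBUG)
    is printed on the console at this console_loglevel."""
    return msg_level < console_loglevel


if __name__ == "__main__":
    with open("/proc/sys/kernel/printk") as f:
        levels = parse_printk_sysctl(f.read())
    for lvl in range(8):
        where = "console" if reaches_console(lvl, levels["console_loglevel"]) else "log buffer only"
        print(lvl, where)
```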

Of course my expectation on a production machine is to have panic_on_oom
set, so that the kernel prints the huge OOM message and then reboots.  So
I don't see how offloading to another thread and then switching right
back to emergency mode could possibly be a practical way to solve the
delay in a serious situation like OOM.

It sounds like you are blaming printk when the problem is a very slow
logging device.

Eric