linux-kernel - Re: [RFC PATCH v1 08/25] printk: add ring buffer and kthread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190312103017.rvdtoc3j34eayleb@pathway.suse.cz>
Date:   Tue, 12 Mar 2019 11:30:17 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Daniel Wang <wonderfly@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Alan Cox <gnomes@...rguk.ukuu.org.uk>,
        Jiri Slaby <jslaby@...e.com>,
        Peter Feiner <pfeiner@...gle.com>, linux-serial@...r.kernel.org
Subject: Re: [RFC PATCH v1 08/25] printk: add ring buffer and kthread

On Mon 2019-03-11 11:51:49, John Ogness wrote:
> On 2019-03-07, Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:
> > I don't really understand the role of loglevel anymore.
> 
> "what the kernel considers" is a configuration option of the
> administrator. The administrator can increase the verbocity of the
> console (loglevel) without having negative effects on the system
> itself.

Where do you get the confidence that atomic console will not
slow down the system? Have you tried it on real life workload
when debugging a real life bug?

Some benchmarks might help. Well, it would be needed to
trigger some messages from them and see how the different
approaches affect the overall system performance.

> Also, if the system were to suddenly crash, those crash messages
> shouldn't be in jeopardy just because the verbocity of the console was
> turned up.

This expects that the error messages will be enough to discover
and fix the problem.

> You (and Petr) talk about that _all_ console printing is for
> emergencies. That if an administrator sets the loglevel to 7 it is
> because the pr_info messages are just as important as the pr_emerg.

It might be true when the messages with higher level (more critical)
are not enough to understand the situation.

> And if that is indeed the intention of console printing and loglevel, then
> why is asynchronous printk calls for console messages even allowed
> today? IMO that isn't taking the importance of the message very
> seriously.

Because it was working pretty well in the past. The amount of messages
is still growing (code complexity, more CPUs, more devices, ...).
Our customers have started reporting softlockups "only" 7 years ago
or so.

We currently have two level handling of messages:

   + all messages can be seen from userspace
   + messages below console_loglevel can be seen
     on the console

You are introducing one more level of handling:

   + critical messages are printed on the console directly
     even before the queued less critical ones

The third level would be acceptable when:

   + atomic consoles are reliable enough
   + the code complexity is worth the gain

IMHO, we mix too many things here:

   + log buffer implementation
   + console offload
   + direct console handling using atomic consoles

I see the potential in all areas:

   + lock less ring buffer helps to avoid deadlocks,
     and extra log buffers

   + console offload prevents too long stalls (softlockups)

   + direct console handling might help to avoid deadlocks
     and might make the output more reliable.

I think that we are on the same page here.

But we must use an incremental approach. It is not acceptable
to replace everything by a single patch. And it is not acceptable
to break important functionality and implement alternative
solution several patches later.

Also no solution is as ideal as it is sometimes presented
in this thread.

Best Regards,
Petr