linux-kernel - Re: [BUG] Threaded printk breaks early debugging


Message-ID: <YqcN9mH/aVwBoIMQ@alley>
Date:   Mon, 13 Jun 2022 12:14:14 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Sergey Senozhatsky <senozhatsky@...omium.org>
Cc:     John Ogness <john.ogness@...utronix.de>,
        Peter Geis <pgwipeout@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>
Subject: Re: [BUG] Threaded printk breaks early debugging

On Mon 2022-06-13 18:05:08, Sergey Senozhatsky wrote:
> On (22/06/13 10:36), John Ogness wrote:
> > >> IMHO, no. Especially in that situation, we do not want printk causing
> > >> that atomic section to become even longer. If the machine has entered
> > >> normal operation, we want printk out of the way.
> > >
> > > At the same time printk throttles itself in such cases: new messages are
> > > not added at a much higher pace than they are printed. So we lower the
> > > chances of missing messages.
> > 
> > That is true if there is only 1 printk caller.
> 
> Well, which is the case when num_online_cpus() == 1?

Good question.

Well, it would be nice to have the same behavior on single-CPU
and SMP systems. Blocking atomic context with a slow console is
bad even on a single-processor system. If there are problems with
lost messages, then we will need a solution for SMP anyway.

> > For SMP systems with printing handovers, it might not help at all.
> > I firmly believe that sprinkling randomness into printk (i.e. system)
> > latencies is not the answer. We need to keep printk lockless and out
> > of the system's way unless there is a real emergency happening.
> 
> Yeah sure.
>
> > This particular thread is not about missed messages due to printk not
> > "throttling the system", but rather the kernel buffers not getting
> > flushed in an emergency. This, of course, needs to be properly handled.
> 
> True, but Peter mentioned
> 
>   "I noticed with threading enabled during large bursts the console
>    drops an excessive amount of messages. It's especially apparent
>    during the handover from earlycon to the normal console."

But this is also the situation when softlockups happen. It should
ideally be solved with a big enough buffer.

Another interesting alternative is Peter Zijlstra's mode
where all messages are printed to the console "immediately".
They are serialized only by the CPU-reentrant lock.
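
For readers unfamiliar with the idea, a CPU-reentrant lock can be
sketched roughly as below. This is a userspace illustration using C11
atomics, not the kernel's implementation: the names cpu_sync_try_get()
and cpu_sync_put(), the NO_OWNER value, and passing the CPU number as
an argument are all assumptions made for the example. The point is
only that the owning CPU may take the lock again (for example from a
nested context on the same CPU), while other CPUs are refused until
the outermost release.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)   /* hypothetical marker: nobody holds the lock */

/* CPU number of the current owner, or NO_OWNER. */
static atomic_int sync_owner = NO_OWNER;

/* Nesting depth; only ever touched by the CPU that owns the lock. */
static int sync_nested;

/*
 * Try to take the lock on behalf of @cpu.
 * Returns 1 on success (first or nested acquisition), 0 if another
 * CPU currently owns the lock.
 */
static int cpu_sync_try_get(int cpu)
{
	int expected = NO_OWNER;

	/* Uncontended case: become the owner. */
	if (atomic_compare_exchange_strong_explicit(&sync_owner, &expected, cpu,
						    memory_order_acquire,
						    memory_order_relaxed))
		return 1;

	/* Reentrant case: this CPU already owns the lock. */
	if (expected == cpu) {
		sync_nested++;
		return 1;
	}

	/* Owned by a different CPU. */
	return 0;
}

/* Release one level of the lock held by @cpu. */
static void cpu_sync_put(int cpu)
{
	/* Only the owner may release. */
	assert(atomic_load_explicit(&sync_owner, memory_order_relaxed) == cpu);

	/* Inner release: just unwind the nesting. */
	if (sync_nested) {
		sync_nested--;
		return;
	}

	/* Outermost release: make the lock available to other CPUs. */
	atomic_store_explicit(&sync_owner, NO_OWNER, memory_order_release);
}
```

A caller that must not fail would spin on cpu_sync_try_get() until it
returns 1, print its message, and then call cpu_sync_put(). That
spinning is the cost of such a mode: every printing CPU waits for the
console, which is why it fits debugging better than production.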

This mode is not good for production systems. But it might
be good for debugging. The good thing is that the behavior
is well defined.

I hope that we will get this mode with the atomic consoles.

Best Regards,
Petr