linux-kernel - Re: Would you help to tell why async printk solution was not taken to upstream kernel ?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180306015222.GA6713@jagdpanzerIV>
Date:   Tue, 6 Mar 2018 10:52:22 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        "Qixuan.Wu" <qixuan.wu@...ux.alibaba.com>,
        linux-kernel-owner <linux-kernel-owner@...r.kernel.org>,
        Petr Mladek <pmladek@...e.com>, Jan Kara <jack@...e.cz>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        "chenggang.qin" <chenggang.qin@...ux.alibaba.com>,
        caijingxian <caijingxian@...ux.alibaba.com>,
        "yuanliang.wyl" <yuanliang.wyl@...baba-inc.com>,
        Tejun Heo <tj@...nel.org>
Subject: Re: Would you help to tell why async printk solution was not taken
 to upstream kernel ?

Hello Steven,

Let me Cc Tejun

On (03/05/18 15:58), Steven Rostedt wrote:
> On Mon, 5 Mar 2018 11:14:16 +0900
> Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:
> 
> > But I still think that it makes sense to change that "print it all" approach.
> > With more clear/explicit watchdog-dependent limits - we do direct printk for
> > 1/2 (or 2/3) of a current watchdog threshold value and offload if there is
> > more stuff in the logbuf. Implicit "logbuf size * console throughput" is
> > harder to understand. Disabling watchdog because of printk is a bit too much
> > of a compromise, probably.
> 
> If you know the baud rate, logbuf size * console throughput is actually
> trivial to calculate.
> 
> Let's see. CONFIG_LOG_BUF_SHIFT defaults to 18 (2^18 = 262144).
> Lets say we have a slow 9600 baud serial, which would give us:
>
>  262144 * 8 / 9600 = 219 (rounded up).
> 
> Thus, the worse case scenario would be 219 seconds to output the entire
> buffer. Add 10 seconds more for extra overhead, and then you have 229
> second watchdog that should never trigger because of a very slow
> console.
>
> (A more common 151200 baud modem would empty the buffer in 14 seconds).

Right. And when you register one more console (e.g. net console), you need
to re-calculate and re-adjust watchdog. When you set kernel log_buf_len
param (e.g. you might do log_buf_len=32G to store ftrace dumps from NMI)
you need to re-calculate and re-adjust watchdog, etc.

> > IOW, is logbuf worth of messages so critically important after all that we
> > are ready to jeopardize the system stability?
> 
> The stability is only in jeopardy if the watchdogs trigger, right?

Not limited to, watchdog threshold is at least deterministic.
Unlike, for instance, this guy

	rcu_read_lock()
	 printk()
	rcu_read_unlock()

It will block RCU grace periods. In the worst case this can become a
full-blown RCU stall and even OOM. In a less dramatic case this can
increase memory pressure, cause reclaimer activities, etc, which is not
a very good development, whether you have a small embedded device or a
server under high load, especially given that all you did was a bunch
of printks.

	-ss