Message-ID: <20180306015222.GA6713@jagdpanzerIV>
Date: Tue, 6 Mar 2018 10:52:22 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
"Qixuan.Wu" <qixuan.wu@...ux.alibaba.com>,
linux-kernel-owner <linux-kernel-owner@...r.kernel.org>,
Petr Mladek <pmladek@...e.com>, Jan Kara <jack@...e.cz>,
linux-kernel <linux-kernel@...r.kernel.org>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
"chenggang.qin" <chenggang.qin@...ux.alibaba.com>,
caijingxian <caijingxian@...ux.alibaba.com>,
"yuanliang.wyl" <yuanliang.wyl@...baba-inc.com>,
Tejun Heo <tj@...nel.org>
Subject: Re: Would you help to tell why async printk solution was not taken
to upstream kernel ?
Hello Steven,
Let me Cc Tejun
On (03/05/18 15:58), Steven Rostedt wrote:
> On Mon, 5 Mar 2018 11:14:16 +0900
> Sergey Senozhatsky <sergey.senozhatsky.work@...il.com> wrote:
>
> > But I still think that it makes sense to change that "print it all" approach.
> > With more clear/explicit watchdog-dependent limits - we do direct printk for
> > 1/2 (or 2/3) of a current watchdog threshold value and offload if there is
> > more stuff in the logbuf. Implicit "logbuf size * console throughput" is
> > harder to understand. Disabling watchdog because of printk is a bit too much
> > of a compromise, probably.
>
> If you know the baud rate, logbuf size * console throughput is actually
> trivial to calculate.
>
> Let's see. CONFIG_LOG_BUF_SHIFT defaults to 18 (2^18 = 262144).
> Lets say we have a slow 9600 baud serial, which would give us:
>
> 262144 * 8 / 9600 = 219 (rounded up).
>
> Thus, the worse case scenario would be 219 seconds to output the entire
> buffer. Add 10 seconds more for extra overhead, and then you have 229
> second watchdog that should never trigger because of a very slow
> console.
>
> (A more common 151200 baud modem would empty the buffer in 14 seconds).
Right. And when you register one more console (e.g. netconsole), you need
to re-calculate and re-adjust the watchdog. When you set the kernel
log_buf_len param (e.g. you might do log_buf_len=32G to store ftrace dumps
from NMI), you need to re-calculate and re-adjust the watchdog again, etc.
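To make that re-calculation concrete, here is a minimal C sketch of the
arithmetic quoted above. The helper names (flush_seconds, watchdog_threshold)
are hypothetical, not kernel APIs. It assumes 8 bits per byte with no framing
overhead, as in the quoted estimate, and that consoles are serviced one after
another so their drain times add up; both are simplifications.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Worst-case seconds needed to push a full log buffer through one
 * console, rounded up. Assumes 8 bits per byte and no start/stop bits.
 */
uint64_t flush_seconds(uint64_t logbuf_bytes, uint64_t bits_per_sec)
{
	uint64_t bits = logbuf_bytes * 8;

	return (bits + bits_per_sec - 1) / bits_per_sec;
}

/*
 * Watchdog threshold that a full flush should not exceed: the sum of the
 * per-console drain times plus a fixed overhead. Summing is an assumption
 * (consoles written sequentially), not something measured.
 */
uint64_t watchdog_threshold(uint64_t logbuf_bytes,
			    const uint64_t *bits_per_sec,
			    size_t nr_consoles,
			    uint64_t overhead_sec)
{
	uint64_t total = overhead_sec;
	size_t i;

	for (i = 0; i < nr_consoles; i++)
		total += flush_seconds(logbuf_bytes, bits_per_sec[i]);

	return total;
}
```

With the quoted numbers, a 262144 byte buffer, one 9600 baud console and
10 seconds of overhead give 229. Adding a second console at the quoted 151200
rate gives 243, and log_buf_len=32G over the 9600 baud console alone comes to
28633116 seconds, roughly 331 days. Every new console or log_buf_len value
means running this again.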
> > IOW, is logbuf worth of messages so critically important after all that we
> > are ready to jeopardize the system stability?
>
> The stability is only in jeopardy if the watchdogs trigger, right?
Not only then. A watchdog threshold is at least deterministic.
Unlike, for instance, this guy:
	rcu_read_lock()
	printk()
	rcu_read_unlock()
It will block RCU grace periods for as long as printk() keeps printing.
In the worst case this can become a full-blown RCU stall and even OOM.
In a less dramatic case it can increase memory pressure, trigger reclaim
activity, etc., which is not a very good development, whether you have a
small embedded device or a server under high load, especially given that
all you did was a bunch of printks.
-ss