linux-kernel - Re: [PATCH] printk: Ratelimit messages printed by console drivers

Message-ID: <20180416042553.GA555@jagdpanzerIV>
Date:   Mon, 16 Apr 2018 13:25:53 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>,
        Petr Mladek <pmladek@...e.com>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        akpm@...ux-foundation.org, linux-mm@...ck.org,
        Peter Zijlstra <peterz@...radead.org>, Jan Kara <jack@...e.cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCH] printk: Ratelimit messages printed by console drivers

On (04/16/18 10:47), Sergey Senozhatsky wrote:
> On (04/14/18 11:35), Sergey Senozhatsky wrote:
> > On (04/13/18 10:12), Steven Rostedt wrote:
> > > 
> > > > The interval is set to one hour. It is rather arbitrary selected time.
> > > > It is supposed to be a compromise between never print these messages,
> > > > do not lockup the machine, do not fill the entire buffer too quickly,
> > > > and get information if something changes over time.
> > > 
> > > 
> > > I think an hour is incredibly long. We only allow 100 lines per hour for
> > > printks happening inside another printk?
> > > 
> > > I think 5 minutes (at most) would probably be plenty. One minute may be
> > > good enough.
> > 
> > Besides 100 lines is absolutely not enough for any real lockdep splat.
> > My call would be - up to 1000 lines in a 1 minute interval.
> 
> Well, if we want to basically turn printk_safe() into printk_safe_ratelimited().
> I'm not so sure about it.
> 
> Besides the patch also rate limits printk_nmi->logbuf - the logbuf
> PRINTK_NMI_DEFERRED_CONTEXT_MASK bypass, which is way too important
> to rate limit it - for no reason.
> 
> Dunno, can we keep printk_safe() the way it is and introduce a new
> printk_safe_ratelimited() specifically for call_console_drivers()?
> 
> Lockdep splat is a one time event, if we lose half of it - we, most
> like, lose the entire report. And call_console_drivers() is not the
> one and only source of warnings/errors/etc. So if we turn printk_safe
> into printk_safe_ratelimited() [not sure we want to do it] for all
> then I want restrictions to be as low as possible, IOW to log_store()
> as many lines as possible.

One more thing,
I'd really prefer to rate limit the function which flushes per-CPU
printk_safe buffers; not the function that appends new messages to
the per-CPU printk_safe buffers. There is a significant difference.

printk_safe does not help us when we are dealing with any external
locks - and call_console_drivers() is precisely that type of case.
The very next thing to happen after a lockdep splat, a spin_lock
debugging report, etc. can be an actual deadlock->panic(). Thus I
want to have the entire report in the per-CPU buffer [if possible],
so we can flush_on_panic() the per-CPU buffers, or at least move the
data to the logbuf and make it accessible in vmcore. If we rate
limit the function that appends data to the per-CPU buffer, then we
may simply suppress [rate limit] the report, so there will be
nothing to flush_on_panic().
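
To illustrate the difference, here is a rough user-space sketch, not
kernel code and not a patch. All names in it (safe_append, safe_flush,
safe_flush_on_panic, struct flush_limit, the buffer sizes) are made up
for the example, and the limiter is a simple fixed-window counter that
I am assuming only for illustration. The point is that the append path
never consults the limiter, so a throttled flush leaves the data in the
per-CPU buffer, where the panic path can still pick it up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SAFE_BUF_SIZE 4096   /* hypothetical per-CPU buffer size */
#define LOG_BUF_SIZE  16384  /* hypothetical main log buffer size */

struct safe_buf { char data[SAFE_BUF_SIZE]; size_t len; };
struct log_buf  { char data[LOG_BUF_SIZE];  size_t len; };

/* Fixed-window limiter state, used by the flush path only. */
struct flush_limit {
	long interval;  /* window length, in caller-defined time units */
	int  burst;     /* flushes allowed per window */
	long begin;     /* start of the current window */
	int  done;      /* flushes performed in the current window */
};

/* Append is never rate limited; a message is dropped only if it does not fit. */
static size_t safe_append(struct safe_buf *b, const char *msg)
{
	size_t n = strlen(msg);

	if (n > SAFE_BUF_SIZE - b->len)
		return 0;
	memcpy(b->data + b->len, msg, n);
	b->len += n;
	return n;
}

/* Move as much buffered data as fits into the log; keep the rest buffered. */
static size_t move_to_log(struct safe_buf *b, struct log_buf *log)
{
	size_t room = LOG_BUF_SIZE - log->len;
	size_t n = b->len < room ? b->len : room;

	memcpy(log->data + log->len, b->data, n);
	log->len += n;
	memmove(b->data, b->data + n, b->len - n);
	b->len -= n;
	return n;
}

/* Rate-limited flush: when over the limit, the data simply stays buffered. */
static size_t safe_flush(struct safe_buf *b, struct flush_limit *rl,
			 struct log_buf *log, long now)
{
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->done = 0;
	}
	if (rl->done >= rl->burst)
		return 0;
	rl->done++;
	return move_to_log(b, log);
}

/* Panic path: bypass the limiter so nothing buffered is lost. */
static size_t safe_flush_on_panic(struct safe_buf *b, struct log_buf *log)
{
	return move_to_log(b, log);
}
```

With a limit of one flush per window, a second safe_flush() inside the
same window returns 0 and the message stays in the safe_buf; a later
safe_flush_on_panic() still moves it to the log. Had the limiter been
placed in safe_append() instead, that message would never have been
stored anywhere.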

	-ss