linux-kernel - Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181126043445.GB540@jagdpanzerIV>
Date:   Mon, 26 Nov 2018 13:34:45 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        Alexander Potapenko <glider@...gle.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>
Subject: Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep
 messages.

On (11/24/18 09:24), Tetsuo Handa wrote:
> >> Steven told me on Plumbers conference that even few initial
> >> characters saved him a day few times.
> > 
> > Yes, and that has happened more than once. I would reboot and retest
> > code that is crashing, and due to a triple fault, the machine would
> > reboot because of some race, and the little output I get from the
> > console would help tremendously.
> > 
> > Remember, debugging the kernel is a lot like forensics, especially when
> > it's from a customer's site. You look at all the evidence that you can
> > get, and sometimes it's just 10 characters in the output that gives you
> > an idea of where things went wrong. I'm really not liking the buffering
> > idea because of this.
> 
> Then, we should not enforce buffering in a way that requires modification of
> printk() callers. That is, we should not ask printk() callers to use their
> private buffer. What we can do is to enable/disable line buffering inside
> printk() depending on the problem the user wants to debug.

Right; overall I tend to agree with what you guys are saying and
I like Petr's "I am more and more wondering if the buffered printk
is worth the effort" comment; and I like Steven's comment on flushes;
and admire Tetsuo's efforts.

I think that printk_seq_buf/printk_buffer was never going to replace
pr_cont() and I never liked the idea. The printk_safe proposal for
lockdep had one OK thing about it - it would pass our normal marshaling
before it would reach the buffering stage. Which means - no buffering
for people who "detest" printk buffering.

This looks better:
	printk->vprint_func->{early_printk/printk_safe/vprintk_emit}->buffering

Than this:
	pr_buffer->buffering->vprintk_func->{early_printk/printk_safe/vprintk_emit}

Another thing is - printk seq_buf/printk_buffer doesn't really solve any
problem. People, who can use seq_buf/char buf[256]/etc. buffering, already
can do so; people who cannot - won't switch to a new buffering printk
anyway.

The bad thing about printk_safe proposal is that it's per-CPU; which
is OK for some paths (like lockdep), but not OK in general (e.g. OOM).

IMO, try_buffered_printk() attempts to solve the problem at the right
place - printk. And it does not break our normal marshaling, so we don't
"fix" printk users and we keep people, who does vprintk_func->early_printk
thing, happy. So I don't dislike try_buffered_printk() approach. And unlike
before, now we are talking about a single line buffering.

If we'd walk this way, I would prefer to NOT introduce any structs and
any new code, or any new "split and log_store() in the middle" rules. Just
a bunch of "struct cont" buffers:

	static struct cont conts[N];

and cont_add()/cont_flush() to handle pr_cont, with all the flushes it
does; but on a per-context basis.
conts[0] should serve as a fallback cont buffer, in case if there are
no available cont buffers left. flush_on_panic() is still miserable,
for sure; probably we can do something about it.

Or... Instead.
We can just leave pr_cont() alone for now. And make it possible to
reconstruct messages - IOW, inject some info to printk messages. We
do this at Samsung (inject CPU number at the beginning of every
message. `cat serial.0 | grep "\[1\]"` to grep for all messages from
CPU1). Probably this would be the simplest thing.

> Also, we should allow disabling "struct cont" depending on the problem (in
> order to allow flushing the 10 characters in the "cont" buffer).
>
> By the way, is the comment
>
>   /*
>    * Continuation lines are buffered, and not committed to the record buffer
>    * until the line is complete, or a race forces it. The line fragments
>    * though, are printed immediately to the consoles to ensure everything has
>    * reached the console in case of a kernel crash.
>    */

printk does not do this anymore; you are right.

> appropriate despite we don't call cont_flush() upon a kernel crash?

I tend to count on flush_on_panic more than on a "last moment"
pr_cont->cont_flush(), which was guaranteed to happen immediately
only with early_con.

A kernel crash usually has enough pr_emerg/printk-s to force cont flush.
Even pr_info() will do. We look at the loglevel much later; so even
messages which never make it to the consoles still flush cont buffer.

	-ss