linux-kernel - Re: POC: Alternative solution: Re: [PATCH 0/4] printk: reimplement LOG


Message-ID: <20200814033424.GA582@jagdpanzerIV.localdomain>
Date:   Fri, 14 Aug 2020 12:34:24 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky@...il.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        kexec@...ts.infradead.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: POC: Alternative solution: Re: [PATCH 0/4] printk: reimplement
 LOG_CONT handling

On (20/08/13 12:35), John Ogness wrote:
> I believe I failed to recognize the fundamental problem. The fundamental
> problem is that the pr_cont() semantics are very poor.

The semantics are pretty clear: use it only in UP early bootup,
anything else is broken :)

  /*
   * Annotation for a "continued" line of log printout (only done after a
   * line that had no enclosing \n). Only to be used by core/arch code
   * during early bootup (a continued line is not SMP-safe otherwise).
   */
  #define KERN_CONT	KERN_SOH "c"

> I now strongly believe that we need to fix those semantics by having the
> pr_cont() user take responsibility for buffering the message. Patching the
> ~2000 pr_cont() users will be far easier than continuing to twist ourselves
> around this madness.

I welcome this effort. We've been talking about the fact that pr_cont() is
not something we can ignore anymore (we have more and more SMP users of
it) since the Kernel Summit in Santa Fe, NM, but the general response back
then was "oh my god, who cares" (pretty sure this is very close to what Ted
Ts'o said during the printk session).

> Here is an example for a new pr_cont() API:
> 
>     struct pr_cont c;
> 
>     pr_cont_alloc_info(&c);
>        (or alternatively)
>     dev_cont_alloc_info(dev, &c);
> 
>     pr_cont(&c, "1");
>     pr_cont(&c, "2");
> 
>     pr_cont_flush(&c);

This might be a bit more complex.

One thing that we need to handle here, I believe, is that the context
which crashes the kernel should flush its cont buffer, because the
information there is relevant to the crash:

	pr_cont_alloc_info(&c);
	pr_cont(&c, "1");
	pr_cont(&c, "2");
	>>
	   oops
	      panic()
	<<
	pr_cont_flush(&c);

We had better flush that context's pr_cont buffer during panic().
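
To make the panic case concrete, here is a minimal userspace sketch, not
kernel code. It assumes the proposed calls keep the names quoted above;
everything else (the open_bufs[] registry, emit_line(), mock_panic_flush(),
the buffer sizes) is hypothetical and only there for illustration. The mock
is single-threaded, so "every buffer still open" stands in for "the buffer
of the crashing context".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define CONT_BUF_SZ	128
#define CONT_MAX_OPEN	8

/* Hypothetical caller-owned continuation buffer (userspace mock). */
struct pr_cont {
	char buf[CONT_BUF_SZ];
	size_t len;
	int slot;	/* index in open_bufs[], -1 when not registered */
};

/* Buffers that were set up but not flushed yet (assumed registry). */
static struct pr_cont *open_bufs[CONT_MAX_OPEN];

/* Stand-in for the log: remembers the last emitted line. */
static char last_line[CONT_BUF_SZ];
static int lines_emitted;

static void emit_line(const char *s)
{
	snprintf(last_line, sizeof(last_line), "%s", s);
	lines_emitted++;
	printf("%s\n", s);
}

static void pr_cont_alloc_info(struct pr_cont *c)
{
	c->len = 0;
	c->buf[0] = '\0';
	c->slot = -1;
	/* Register the buffer so a panic path can find it later. */
	for (int i = 0; i < CONT_MAX_OPEN; i++) {
		if (!open_bufs[i]) {
			open_bufs[i] = c;
			c->slot = i;
			break;
		}
	}
}

static void pr_cont(struct pr_cont *c, const char *fmt, ...)
{
	size_t room = CONT_BUF_SZ - c->len;
	va_list ap;
	int n;

	assert(c->len < CONT_BUF_SZ);
	va_start(ap, fmt);
	n = vsnprintf(c->buf + c->len, room, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	/* vsnprintf() truncates for us; clamp the recorded length. */
	c->len = ((size_t)n >= room) ? CONT_BUF_SZ - 1 : c->len + (size_t)n;
}

static void pr_cont_flush(struct pr_cont *c)
{
	if (c->len)
		emit_line(c->buf);
	c->len = 0;
	c->buf[0] = '\0';
	if (c->slot >= 0) {
		open_bufs[c->slot] = NULL;
		c->slot = -1;
	}
}

/* Mock panic path: push out whatever is still sitting in open buffers. */
static void mock_panic_flush(void)
{
	for (int i = 0; i < CONT_MAX_OPEN; i++)
		if (open_bufs[i])
			pr_cont_flush(open_bufs[i]);
}
```

With this in place, calling mock_panic_flush() between the last pr_cont()
and the regular pr_cont_flush() emits the partial line, and the later flush
finds an empty buffer and does nothing. A real implementation would also
need to decide how panic() locates the crashing context's buffer safely;
the sketch does not address that.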

Another example:


	pr_cont_alloc_info(&c);

	for (i = 0; i < p->sz; i++)
		pr_cont(&c, p->buf[i]);
	>>
	   page fault
	    exit
	<<
	pr_cont_flush(&c);

I believe we need to flush the pr_cont() buffer early in this case as
well, because the information there might be very helpful.
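
A rough userspace sketch of that early flush, under loud assumptions: a NULL
array entry merely simulates the fault (a real page fault cannot be caught
with a check like this), and struct cont_buf, cont_add(), cont_flush() and
dump_items() are hypothetical names, not part of the proposed API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONT_BUF_SZ 64

/* Hypothetical caller-owned buffer (userspace mock). */
struct cont_buf {
	char buf[CONT_BUF_SZ];
	size_t len;
};

static char flushed[CONT_BUF_SZ];	/* stand-in for the log */

static void cont_add(struct cont_buf *c, const char *s)
{
	size_t room, n = strlen(s);

	assert(c->len < CONT_BUF_SZ);
	room = CONT_BUF_SZ - 1 - c->len;
	if (n > room)
		n = room;	/* truncate silently in this sketch */
	memcpy(c->buf + c->len, s, n);
	c->len += n;
	c->buf[c->len] = '\0';
}

static void cont_flush(struct cont_buf *c)
{
	if (c->len) {
		snprintf(flushed, sizeof(flushed), "%s", c->buf);
		printf("%s\n", flushed);
	}
	c->len = 0;
	c->buf[0] = '\0';
}

/*
 * Walk an array of strings. A NULL entry simulates the fault: the
 * loop stops, but what was gathered so far is still flushed.
 * Returns 0 on success, -1 if it stopped early.
 */
static int dump_items(const char *const *items, size_t sz)
{
	struct cont_buf c = { .len = 0 };
	int ret = 0;

	for (size_t i = 0; i < sz; i++) {
		if (!items[i]) {
			ret = -1;
			break;
		}
		cont_add(&c, items[i]);
	}
	cont_flush(&c);	/* runs on both the normal and the early-exit path */
	return ret;
}
```

For the items "a", "b", NULL, "d" the function stops at the third entry and
still emits "ab". In the kernel the flush would have to be triggered from
the fault/exit path rather than by the caller, which is the hard part this
sketch leaves out.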

	-ss