linux-kernel - Re: [PATCH 2/2] printk: always report lost messages on serial console

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170103165356.GN14894@pathway.suse.cz>
Date:   Tue, 3 Jan 2017 17:53:56 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Hurley <peter@...leysoftware.com>,
        linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Subject: Re: [PATCH 2/2] printk: always report lost messages on serial console

On Wed 2017-01-04 00:47:45, Sergey Senozhatsky wrote:
> On (01/03/17 15:55), Petr Mladek wrote:
> [..]
> > This causes the opposite problem. We might print a message that was supposed
> > to be suppressed.
> 
> so what? yes, we print a message that otherwise would have been suppressed.
> not a big deal. at all. we are under high printk load and the best thing
> we can do is to report "we are losing the messages" straight ahead. the
> next 'visible' message may be seconds/minutes/forever away. think of a
> printk() flood of messages with suppressed loglevel coming from CPUA-CPUX,
> big enough to drain all 'visible' loglevel messages from CPUZ. we are
> back to problem "a".
> 
> thus I want a simple bool flag and a simple rule: we see something - we say it.

So, you prefer to print some random debug message instead of an
emergency one? The console_level is there for a reason.

If there is a flood of messages, console_level = 1 and use your
solution, you might see:

** 1324 printk messages dropped ** <notice: random message>
** 4234 printk messages dropped ** <debug: random message>
** 3243 printk messages dropped ** <info: random message>
** 2343 printk messages dropped ** <debug: random message>

It will always drop a message because you always process only one
and many new appear in the meantime. While with my solution,
you should see:

** 1324 printk messages dropped ** <alert: random message>
** 523 printk messages dropped ** <emerg: random message>
** 324 printk messages dropped ** <emerg: random message>
** 345 printk messages dropped ** <alert: random message>

You will see messages filtered by the console_level. Also
less number of messages should get dropped because you quickly
skip the less important ones.

The filtering might be the only way to see the important
messages. On the other hand, the fact that messages are
dropped might be quessed from the context.

Your patch fixes a bug by introducing another bug that is
probably even more serious.


I understand that my patch is much more complex. But is
the final code realy more complex?

console_unlock() is too long. The helper for the msg_print()
calls makes sense on its own. The rest is just reshufling
of the conditions.

It actually would make sense to hide also the increment
of console_idx/console_seq into the helper function.
Especially console_idx is related to the struct printk_log
that is no longer accessed in console_unlock() directly.

Here is v2:

>From 82ff26726508c8a5a97409c7270d1f7abc7ed740 Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@...e.com>
Date: Tue, 3 Jan 2017 14:32:04 +0100
Subject: [PATCH v2] printk: Always report lost messages on serial console

The "printk messages dropped" report is 'attached' to a kernel
message located at console_idx offset. This does not work well
if we skip that message due to loglevel filtering, because in
this case we also skip/lose dropped message report.

A simple solution would be to ignore the level and always print
the warning with the next message.

But the situation suggests that we are under a high load and
could not afford printing less important messages.

Also we could not print only the warning because we might lose
even more messages in the meantime.

The best solution seems to be to print the warning with
the next visible message.

This patch tries to keep readability of the code. It puts
msg_print*() calls into a helper function. Also it hides there
the visibility check and increment of console_idx/console_seq.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Signed-off-by: Petr Mladek <pmladek@...e.com>
---
 kernel/printk/printk.c | 70 +++++++++++++++++++++++++++++---------------------
 1 file changed, 41 insertions(+), 29 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 0dbde4e7bb15..ae80dc3284dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2161,6 +2161,35 @@ static inline int can_use_console(void)
 	return cpu_online(raw_smp_processor_id()) || have_callable_console();
 }
 
+static bool console_check_and_print_next_msg(
+		char *text, size_t size, size_t *len,
+		char *ext_text, size_t ext_size, size_t *ext_len)
+{
+	struct printk_log *msg = log_from_idx(console_idx);
+	bool ret = true;
+
+	if (suppress_message_printing(msg->level)) {
+		ret = false;
+		goto out;
+	}
+
+	*len += msg_print_text(msg, false, text, size);
+	if (nr_ext_console_drivers) {
+		*ext_len = msg_print_ext_header(ext_text, ext_size,
+						msg, console_seq);
+		*ext_len += msg_print_ext_body(ext_text + *ext_len,
+					       ext_size - *ext_len,
+					       log_dict(msg), msg->dict_len,
+					       log_text(msg), msg->text_len);
+	}
+
+out:
+	console_idx = log_next(console_idx);
+	console_seq++;
+
+	return ret;
+}
+
 /**
  * console_unlock - unlock the console system
  *
@@ -2215,9 +2244,9 @@ void console_unlock(void)
 	}
 
 	for (;;) {
-		struct printk_log *msg;
+		bool visible_msg = false;
 		size_t ext_len = 0;
-		size_t len;
+		size_t len = 0;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
 		if (seen_seq != log_next_seq) {
@@ -2232,37 +2261,20 @@ void console_unlock(void)
 			/* messages are gone, move to first one */
 			console_seq = log_first_seq;
 			console_idx = log_first_idx;
-		} else {
-			len = 0;
 		}
-skip:
-		if (console_seq == log_next_seq)
-			break;
 
-		msg = log_from_idx(console_idx);
-		if (suppress_message_printing(msg->level)) {
-			/*
-			 * Skip record we have buffered and already printed
-			 * directly to the console when we received it, and
-			 * record that has level above the console loglevel.
-			 */
-			console_idx = log_next(console_idx);
-			console_seq++;
-			goto skip;
-		}
+		/* Get the next message with a visible level */
+		while (console_seq < log_next_seq && !visible_msg) {
+			visible_msg = console_check_and_print_next_msg(
+					text + len, sizeof(text) - len, &len,
+					ext_text, sizeof(ext_text), &ext_len);
 
-		len += msg_print_text(msg, false, text + len, sizeof(text) - len);
-		if (nr_ext_console_drivers) {
-			ext_len = msg_print_ext_header(ext_text,
-						sizeof(ext_text),
-						msg, console_seq);
-			ext_len += msg_print_ext_body(ext_text + ext_len,
-						sizeof(ext_text) - ext_len,
-						log_dict(msg), msg->dict_len,
-						log_text(msg), msg->text_len);
 		}
-		console_idx = log_next(console_idx);
-		console_seq++;
+
+		/* No warning and no visible message => done */
+		if (!len && !visible_msg)
+			break;
+
 		raw_spin_unlock(&logbuf_lock);
 
 		stop_critical_timings();	/* don't trace print latency */
-- 
1.8.5.6