lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160629050835.GA730@swordfish>
Date:	Wed, 29 Jun 2016 14:08:35 +0900
From:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:	Petr Mladek <pmladek@...e.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Jan Kara <jack@...e.com>, Tejun Heo <tj@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	linux-kernel@...r.kernel.org,
	Byungchul Park <byungchul.park@....com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCH v12 0/3] printk: Make printk() completely async

Hello,

the patch to fix the async KERN_CONT printk regression I mentioned
several days ago in another thread.

as a separate patch for now, to ease the review. will squash with 0001.

============== 8< ==============

>From 70f65ed55eb82a1b8b74fdbd1a7afc9e77e9b380 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: [PATCH] printk: print KERN_CONT buffer in sync printk mode

One of my boxen has revealed that there is a race condition now between
pr_cot() output and printk_kthread wake_up() time. Schematically, the CPU
that triggered the race conditions was doing
something like this:

	for (...) {
		for (....) {
			...
			pr_cont(...);
			...
		}
		pr_cont("\n");
	}

the resulting output was:

$ dmesg

00000001 00000002 00000003 00000004 00000005 00000006 00000007
00000008
00000009 0000000a 0000000b 0000000c 0000000d 0000000e
0000000f 00000010 00000011 00000012 00000013 00000014 00000015
0000002b 0000002c 0000002d 0000002e 0000002f 00000030 00000031
....


KERN_CONT output is heavily relying on the fact there will be no other
CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s simultaneously.
This basically means that pr_cont() CPU is expected to be the one that actually
does the printing and log_store() of the entire cont buffer:

	pr_cont()
		console_unlock()
			console_cont_flush()
			log_store()	/* cont buffer */
	pr_cont()
		console_unlock()
			console_cont_flush()
			log_store()	/* cont buffer */
	....

Async printk breaks this contract, because console_cont_flush() is now deferred.
Thus pr_cont() will see a not fully flushed cont buffer, which will force it to
append a new KERN_CONT not to cont buffer, but to buffer log as a separate log
entry via log_store(). Due to the deferred nature of console_cont_flush() the
very next pr_cont() may see cont buffer being flushed, so it will keep the
message in the cont buffer and log_store() it later. So the cont line will split
across several log entries -- printk_kthread can take some time to wkeup,
during which pr_cont() will keep splitting cont line:

$ dmesg

00000001 00000002 00000003 00000004 00000005 00000006 00000007
00000008
00000009
0000000a
0000000b
0000000c
0000000d
0000000e
....

To fix the issue we need to switch to sync printk mode for KERN_CONT messages
and when we cont_flush() the cont buffer.


CPU0							CPU1

vprintk_emit(KERN_CONT "00000001")
	if (!(lflags & LOG_NEWLINE))
		cont_add()
			if (!cont.len)
				starts a new cont line
	wake_up() printk_kthread
...

vprintk_emit(KERN_CONT "0000004")			entering console_unlock()
	if (!(lflags & LOG_NEWLINE))
		cont_add()
	wake_up() printk_kthread

vprintk_emit(KERN_CONT "0000005")
	if (!(lflags & LOG_NEWLINE))
		cont_add()
	wake_up() printk_kthread			console_cont_flush()
								cont_print_text()

vprintk_emit(KERN_CONT "0000006")
	if (!(lflags & LOG_NEWLINE))
		cont_add()
	wake_up() printk_kthread
							leaving console_unlock()

vprintk_emit(KERN_CONT "0000007")
	if (!(lflags & LOG_NEWLINE))			entering console_unlock()
		cont_add()
	wake_up() printk_kthread			console_cont_flush()
								cont_print_text()
							leaving console_unlock()
vprintk_emit(KERN_CONT "\n")
	if (lflags & LOG_NEWLINE)
		if (cont.len) {
			cont_add()
			cont_flush()
				if (cont.cons) {
					cont.flags = flags;
					cont.flushed = true;
					^^^^^^^^^^^^^^^^^^^^
				}
			log_store() /* "0000001 .. 0000007" */
			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		}
	wake_up() process				leaving console_unlock()

vprintk_emit(KERN_CONT "0000008")
	if (!(lflags & LOG_NEWLINE))			entering console_unlock()
		cont_add()
		{
			if (cont.len && cont.flushed)
			^^^^^^^^^^^  not flushed yet!
				return false;
			...
		}

		log_store()	/* "0000008" */
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ !
	wake_up() printk_kthread
							console_cont_flush()
							^^^^^^^^^^^ now flushed!
								cont_print_text()
							call_console_drivers('\n');

vprintk_emit(KERN_CONT "0000009")
	if (!(lflags & LOG_NEWLINE))
		cont_add()
	wake_up() printk_kthread

							msg_print_text()
							call_console_drivers("0000008")
							^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
							leaving console_unlock()
...

vprintk_emit(KERN_CONT "\n")
	if (lflags & LOG_NEWLINE)
		log_store() /* "00000009 ... 0000000e" */
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ !

and so on.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
---
 kernel/printk/printk.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index eeafd73..bbb4180 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1705,6 +1705,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 	int printed_len = 0;
 	int nmi_message_lost;
 	bool in_sched = false;
+	/* do not print the KERN_CONT buffer asynchronously */
+	bool has_cont_msg = false;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
@@ -1804,6 +1806,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 		lflags |= LOG_PREFIX|LOG_NEWLINE;
 
 	if (!(lflags & LOG_NEWLINE)) {
+		has_cont_msg = true;
+
 		/*
 		 * Flush the conflicting buffer. An earlier newline was missing,
 		 * or another task also prints continuation lines.
@@ -1834,6 +1838,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 				stored = cont_add(facility, level, text,
 						  text_len);
 			cont_flush(LOG_NEWLINE);
+			has_cont_msg = true;
 		}
 
 		if (stored)
@@ -1859,8 +1864,9 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 *
 		 * However we resort to synchronous printing of messages during
 		 * early boot, when synchronous printing was explicitly
-		 * requested by a kernel parameter, or when console_verbose()
-		 * was called to print everything during panic / oops.
+		 * requested by a kernel parameter, when we have a KERN_CONT
+		 * buffer to print, or when console_verbose() was called to
+		 * print everything during panic / oops.
 		 * Unlike bust_spinlocks() and oops_in_progress,
 		 * console_verbose() sets console_loglevel to MOTORMOUTH and
 		 * never clears it, while oops_in_progress can go back to 0,
@@ -1868,7 +1874,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 * operate in sync mode once panic() occurred.
 		 */
 		if (console_loglevel != CONSOLE_LOGLEVEL_MOTORMOUTH &&
-				can_printk_async()) {
+				can_printk_async() &&
+				!has_cont_msg) {
 			/* Offload printing to a schedulable context. */
 			printk_kthread_need_flush_console = true;
 			wake_up_process(printk_kthread);
-- 
2.9.0.rc1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ