linux-kernel - Re: [PATCH 2/2] Subject: printk: Don't trap random context in infinite log

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 7 Nov 2017 05:23:50 -0800
From:   Tejun Heo <tj@...nel.org>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-kernel@...r.kernel.org, kernel-team@...com,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 2/2] Subject: printk: Don't trap random context in
 infinite log_buf flush

Hello, Sergey.

On Tue, Nov 07, 2017 at 11:04:34AM +0900, Sergey Senozhatsky wrote:
> just to make sure. there is a typo in Steven's patch:
> 
> 	while (!READ_ONCE(console_waiter))
> 
> should be
> 
> 	while (READ_ONCE(console_waiter))
> 
> is this the "tweaking" you are talking about?

Oh, I was talking about tweaking the repro, but I'm not sure the above
would change anything.  The problem that the repro demonstrates is a
message deluge involving an non-sleepable flusher + local irq (or
other atomic contexts) message producer.

In the above case, none of the involved contexts can serve as the
flusher for a long time without messing up the system.  If you wanna
allow printks to be async without falling into these lockups, you
gotta introduce an independent safe context to flush from.

> > > there are some concerns, like a huge number of printk-s happening while
> > > console_sem is locked. e.g. console_lock()/console_unlock() on one of the
> > > CPUs, or console_lock(); printk(); ... printk(); console_unlock();
> > 
> > Unless we make all messages fully synchronous, I don't think there's a
> > good solution for that and I don't think we wanna make everything
> > fully synchronous.
> 
> this is where it becomes complicated. offloading logic is not binary,
> unfortunately. we normally want to offload; but not always. things
> like sysrq or late PM warnings, or kexec, etc. want to stay fully sync,
> regardless the consequences. some of sysrq prints out even do
> touch_nmi_watchdog() and touch_all_softlockup_watchdogs(). current
> printk-kthread patch set tries to consider those cases and to avoid
> any offloading.

Yeah, sure, selectively opting out of asynchronous operation is a
different (solvable) issue.  Also, just to be clear, the proposed
patch doesn't make any of these worse in any meaningful way - e.g. we
could end up trapping a nice 20 task pinned to an overloaded CPU in
the flusher role.

The following is a completely untested patch to show how we can put
the console in full sync mode, just the general idea.  I'm a bit
skeptical we really wanna do this given that we already (with or
without the patch) stay sync for most of these events due to the way
we go async, but, yeah, if we wanna do that, we can do that.

Thanks.

---
 kernel/printk/printk.c |   40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -438,6 +438,7 @@ static char *log_buf = __log_buf;
 static u32 log_buf_len = __LOG_BUF_LEN;
 
 /* async flush */
+static bool console_sync_mode;
 static struct kthread_worker *console_async_worker;
 
 static void console_async_workfn(struct kthread_work *work);
@@ -2051,7 +2052,8 @@ void console_lock(void)
 {
 	might_sleep();
 
-	down_console_sem();
+	down_console_sem();	// make it return w/ irq disabled on locked
+				// and don't reenable till unlock if sync mode
 	if (console_suspended)
 		return;
 	console_locked = 1;
@@ -2069,8 +2071,22 @@ EXPORT_SYMBOL(console_lock);
  */
 int console_trylock(void)
 {
-	if (down_trylock_console_sem())
+	unsigned long flags;
+	bool locked;
+
+	printk_safe_enter_irqsave(flags);
+	do {
+		// we can make this nest safe if necessary by remembering the holding cpu
+		locked = down_trylock_console_sem();
+		if (locked)
+			break;
+		cpu_relax();
+	} while (!locked && READ_ONCE(console_sync_mode));
+	printk_safe_exit_irqrestore(flags);
+
+	if (!locked)
 		return 0;
+
 	if (console_suspended) {
 		up_console_sem();
 		return 0;
@@ -2087,7 +2103,8 @@ int console_trylock(void)
 	 * rcu_preempt_depth(), otherwise RCU read sections modify
 	 * preempt_count().
 	 */
-	console_may_schedule = !oops_in_progress &&
+	console_may_schedule = !console_sync_mode &&
+			!oops_in_progress &&
 			preemptible() &&
 			!rcu_preempt_depth();
 	return 1;
@@ -2218,7 +2235,8 @@ again:
 		 * more messages to be printed, we'll punt to the async
 		 * worker.
 		 */
-		if (!target_seq || current_is_console_async())
+		if (!target_seq || current_is_console_async() ||
+		    !console_sync_mode)
 			target_seq = log_next_seq;
 
 		if (console_seq < log_first_seq) {
@@ -2293,7 +2311,7 @@ skip:
 	raw_spin_unlock(&logbuf_lock);
 	printk_safe_exit_irqrestore(flags);
 
-	if (request_async) {
+	if (request_async && !console_sync_mode) {
 		if (console_async_worker) {
 			/* async is online, punt */
 			kthread_queue_work(console_async_worker,
@@ -2413,6 +2431,18 @@ void console_start(struct console *conso
 }
 EXPORT_SYMBOL(console_start);
 
+void console_enter_sync_mode(void)
+{
+	console_lock();
+	WRITE_ONCE(console_sync_mode, true);
+	console_unlock();
+}
+
+void console_leave_sync_mode(void)
+{
+	WRITE_ONCE(console_sync_mode, false);
+}
+
 static int __read_mostly keep_bootcon;
 
 static int __init keep_bootcon_setup(char *str)