lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 25 Oct 2016 16:44:51 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Joe Perches <joe@...ches.com>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Tejun Heo <tj@...nel.org>, Calvin Owens <calvinowens@...com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: linux.git: printk() problem

On Mon 2016-10-24 19:22:59, Linus Torvalds wrote:
> On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky
> > <sergey.senozhatsky.work@...il.com> wrote:
> >>
> >> I think cont_flush() should grab the logbuf_lock lock, because
> >> it does log_store() and touches the cont.len. so something like
> >> this perhaps
> >
> > Absolutely. Good catch.
> 
> Actually, you can't do it the way you did (inside cont_flush), because
> "cont_flush()" is already called with logbuf_lock held in most cases
> (see "cont_add()").
> 
> So it's really just the timer function that needs to take the
> logbuf_lock before it calls cont_flush().
> 
> So here's a new version. How does this look to you?
> 
> Again, this still tests "cont.len" outside the lock (not just in
> console_unlock(), but also in deferred_cont_flush()). And it's fine:
> even if it sees the "wrong" value due to some race, it does so either
> because cont.len was just set to non-zero (and whoever set it will
> force the re-check anyway), or it got cleared just as it was tested
> (and at worst you end up with an extra timer invocation).
> 
> +static void flush_timer(unsigned long data)
> +{
> +	unsigned long flags;
> +	bool did_flush;
> +
> +	raw_spin_lock_irqsave(&logbuf_lock, flags);
> +	did_flush = cont_flush();
> +	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	if (did_flush)
> +		wake_up_klogd();

We wake only klogd and syslog but not console here. Same problem is
also with some other recently added cont_flush() calls, e.g. in
kmsg_dump(), kmsg_dump_get_line().

BTW: It should be safe to call wake_up_klogd() in the locked area.
It just modifies a per-CPU variable and queues IRQ work.

I have played with it a bit and the result is below. Feel free
to just use the pieces or the idea if you like a part of it.


>From 226c4bfbe9f81bdd3b91b6b0ff3e268b1c09085f Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@...e.com>
Date: Tue, 25 Oct 2016 15:23:13 +0200
Subject: [PATCH] printk: Poke console when cont buffer is flushed

The commit bfd8d3f23b51018388be041 ("printk: make reading the kernel
log flush pending lines") allows to add new message into the log
buffer without flushing it to the console.

This patch reworks wake_up_klogd() to a generic printk_poke().
It pokes the loggers or console or both where necessary.

I have renamed some some variables and functions. I am not
if it is better readable now. Anyway, the old names came
from times before printk_deferred() when the irq work
handled only klogd.

I am not super happy the new state because the irq work is
called from many locations and there are cycles.
The cycle breaks when cont.len is zero but it might
be worth some clean up. For example, deferred_cont_flush()
call in printk_poke_func() looks superfluous. But
I wanted to be rather conservative for the moment.

Also I wonder if we really need per-CPU variable for
the printk_pending/printk_need_poke flag. A global
variable might help to reduce some calls and it would
not complicate the code.

This patch is based on top of the fix proposed at
https://lkml.kernel.org/r/CA+55aFwKYnrMJr_vSE+GfDGszeUGyd=CPUD15-zZ8yWQW61GBA@mail.gmail.com

Signed-off-by: Petr Mladek <pmladek@...e.com>
---
 include/linux/printk.h |  7 ++++--
 kernel/printk/printk.c | 67 ++++++++++++++++++++++++++++----------------------
 lib/bust_spinlocks.c   |  2 +-
 3 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index eac1af8502bb..6fdced13d2c9 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -31,6 +31,9 @@ static inline const char *printk_skip_level(const char *buffer)
 	return buffer;
 }
 
+#define PRINTK_POKE_CONSOLE	0x01
+#define PRINTK_POKE_LOGGERS	0x02
+
 #define CONSOLE_EXT_LOG_MAX	8192
 
 /* printk's without a loglevel use this.. */
@@ -185,7 +188,7 @@ extern bool printk_timed_ratelimit(unsigned long *caller_jiffies,
 devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void __user *buf,
 			  size_t *lenp, loff_t *ppos);
 
-extern void wake_up_klogd(void);
+extern void printk_poke(int poke);
 
 char *log_buf_addr_get(void);
 u32 log_buf_len_get(void);
@@ -220,7 +223,7 @@ static inline bool printk_timed_ratelimit(unsigned long *caller_jiffies,
 	return false;
 }
 
-static inline void wake_up_klogd(void)
+static inline void printk_poke(int poke)
 {
 }
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e63aa679614e..2c231bb73744 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1620,13 +1620,11 @@ static bool cont_flush(void)
 static void flush_timer(unsigned long data)
 {
 	unsigned long flags;
-	bool did_flush;
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	did_flush = cont_flush();
+	if (cont_flush())
+		printk_poke(PRINTK_POKE_CONSOLE | PRINTK_POKE_LOGGERS);
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
-	if (did_flush)
-		wake_up_klogd();
 }
 
 static void deferred_cont_flush(void)
@@ -2233,7 +2231,7 @@ void console_unlock(void)
 	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
-	bool wake_klogd = false;
+	bool poke_loggers = false;
 	bool do_cond_resched, retry;
 
 	if (console_suspended) {
@@ -2274,7 +2272,7 @@ void console_unlock(void)
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
 		if (seen_seq != log_next_seq) {
-			wake_klogd = true;
+			poke_loggers = true;
 			seen_seq = log_next_seq;
 		}
 
@@ -2350,8 +2348,8 @@ void console_unlock(void)
 	if (retry && console_trylock())
 		goto again;
 
-	if (wake_klogd || cont.len)
-		wake_up_klogd();
+	if (poke_loggers || cont.len)
+		printk_poke(PRINTK_POKE_LOGGERS);
 }
 EXPORT_SYMBOL(console_unlock);
 
@@ -2730,40 +2728,45 @@ static int __init printk_late_init(void)
 
 #if defined CONFIG_PRINTK
 /*
- * Delayed printk version, for scheduler-internal messages:
+ * Poke console and loggers via IRQ work so that it can be done
+ * from any context.
  */
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
-
-static DEFINE_PER_CPU(int, printk_pending);
+static DEFINE_PER_CPU(int, printk_need_poke);
 
-static void wake_up_klogd_work_func(struct irq_work *irq_work)
+static void printk_poke_func(struct irq_work *irq_work)
 {
-	int pending = __this_cpu_xchg(printk_pending, 0);
+	int poke = __this_cpu_xchg(printk_need_poke, 0);
 
-	if (pending & PRINTK_PENDING_OUTPUT) {
+	if (poke & PRINTK_POKE_CONSOLE) {
 		/* If trylock fails, someone else is doing the printing */
 		if (console_trylock())
 			console_unlock();
 	}
 
-	if (pending & PRINTK_PENDING_WAKEUP)
+	if (poke & PRINTK_POKE_LOGGERS)
 		wake_up_interruptible(&log_wait);
+
 	deferred_cont_flush();
 }
 
-static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
-	.func = wake_up_klogd_work_func,
+static DEFINE_PER_CPU(struct irq_work, printk_poke_work) = {
+	.func = printk_poke_func,
 	.flags = IRQ_WORK_LAZY,
 };
 
-void wake_up_klogd(void)
+void printk_poke(int poke)
 {
 	preempt_disable();
-	if (waitqueue_active(&log_wait)) {
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
-	}
+
+	if (poke & PRINTK_POKE_CONSOLE)
+		this_cpu_or(printk_need_poke, PRINTK_POKE_CONSOLE);
+
+	if ((poke & PRINTK_POKE_LOGGERS) && waitqueue_active(&log_wait))
+		this_cpu_or(printk_need_poke, PRINTK_POKE_LOGGERS);
+
+	if (this_cpu_read(printk_need_poke))
+		irq_work_queue(this_cpu_ptr(&printk_poke_work));
+
 	preempt_enable();
 }
 
@@ -2777,8 +2780,7 @@ int printk_deferred(const char *fmt, ...)
 	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
+	printk_poke(PRINTK_POKE_CONSOLE);
 	preempt_enable();
 
 	return r;
@@ -2893,6 +2895,7 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 {
 	struct kmsg_dumper *dumper;
 	unsigned long flags;
+	int did_flush = 0;
 
 	if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
 		return;
@@ -2906,7 +2909,7 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 		dumper->active = true;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		cont_flush();
+		did_flush |= cont_flush();
 		dumper->cur_seq = clear_seq;
 		dumper->cur_idx = clear_idx;
 		dumper->next_seq = log_next_seq;
@@ -2920,6 +2923,9 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 		dumper->active = false;
 	}
 	rcu_read_unlock();
+
+	if (did_flush)
+		printk_poke(PRINTK_POKE_CONSOLE);
 }
 
 /**
@@ -2997,7 +3003,8 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog,
 	bool ret;
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	cont_flush();
+	if (cont_flush())
+		printk_poke(PRINTK_POKE_CONSOLE);
 	ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
@@ -3039,7 +3046,9 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	cont_flush();
+	if (cont_flush())
+		printk_poke(PRINTK_POKE_CONSOLE);
+
 	if (dumper->cur_seq < log_first_seq) {
 		/* messages are gone, move to first available one */
 		dumper->cur_seq = log_first_seq;
diff --git a/lib/bust_spinlocks.c b/lib/bust_spinlocks.c
index f8e0e5367398..2bd31977058f 100644
--- a/lib/bust_spinlocks.c
+++ b/lib/bust_spinlocks.c
@@ -26,6 +26,6 @@ void __attribute__((weak)) bust_spinlocks(int yes)
 #endif
 		console_unblank();
 		if (--oops_in_progress == 0)
-			wake_up_klogd();
+			printk_poke(PRINTK_POKE_LOGGERS);
 	}
 }
-- 
1.8.5.6

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ