lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190509120903.28939-1-daniel.vetter@ffwll.ch>
Date:   Thu,  9 May 2019 14:09:03 +0200
From:   Daniel Vetter <daniel.vetter@...ll.ch>
To:     Intel Graphics Development <intel-gfx@...ts.freedesktop.org>
Cc:     DRI Development <dri-devel@...ts.freedesktop.org>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Daniel Vetter <daniel.vetter@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        John Ogness <john.ogness@...utronix.de>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        linux-kernel@...r.kernel.org
Subject: [PATCH] RFC: console: hack up console_lock more v3

console_trylock, called from within printk, can be called from pretty
much anywhere. Including try_to_wake_up. Note that this isn't common,
usually the box is in pretty bad shape at that point already. But it
really doesn't help when then lockdep jumps in and spams the logs,
potentially obscuring the real backtrace we're really interested in.
One case I've seen (slightly simplified backtrace):

 Call Trace:
  <IRQ>
  console_trylock+0xe/0x60
  vprintk_emit+0xf1/0x320
  printk+0x4d/0x69
  __warn_printk+0x46/0x90
  native_smp_send_reschedule+0x2f/0x40
  check_preempt_curr+0x81/0xa0
  ttwu_do_wakeup+0x14/0x220
  try_to_wake_up+0x218/0x5f0
  pollwake+0x6f/0x90
  credit_entropy_bits+0x204/0x310
  add_interrupt_randomness+0x18f/0x210
  handle_irq+0x67/0x160
  do_IRQ+0x5e/0x130
  common_interrupt+0xf/0xf
  </IRQ>

This alone isn't a problem, but the spinlock in the semaphore is also
still held while waking up waiters (up() -> __up() -> try_to_wake_up()
callchain), which then closes the runqueue vs. semaphore.lock loop,
and upsets lockdep, which issues a circular locking splat to dmesg.
Worse it upsets developers, since we don't want to spam dmesg with
clutter when the machine is dying already.

Fix this by creating a prinkt_safe_up() which calls wake_up_process
outside of the spinlock. This isn't correct in full generality, but
good enough for console_lock:

- console_lock doesn't use interruptible or killable or timeout down()
  calls, hence an up() is the only thing that can wake up a process.
  Hence the process can't get woken and killed and reaped while we try
  to wake it up too.

- semaphore.c always updates the waiter list while under the spinlock,
  so there's no other races. Specifically another process that races
  with a quick console_lock/unlock while we've dropped the spinlock
  already won't see our own waiter.

Note that we only have to break the recursion for the semaphore.lock
spinlock of the console_lock. Recursion within various scheduler
related locks is already prevented by the printk_safe_enter/exit pair
in __up_console_sem().

Also cc'ing John Ogness since perhaps his printk rework fixes this all
properly.

v2: Ditch attempt to fix console_trylock.

v3: Add a comment explaining why the taks we're waking won't
disappear (Chris), and improve commit message to address review
questions.

Signed-off-by: Daniel Vetter <daniel.vetter@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Will Deacon <will.deacon@....com>
Cc: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Daniel Vetter <daniel.vetter@...ll.ch>
Cc: John Ogness <john.ogness@...utronix.de>
Cc: Chris Wilson <chris@...is-wilson.co.uk>
Cc: linux-kernel@...r.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@...ll.ch>
---
 include/linux/semaphore.h  |  1 +
 kernel/locking/semaphore.c | 31 +++++++++++++++++++++++++++++++
 kernel/printk/printk.c     |  2 +-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
index 11c86fbfeb98..7e839c72809d 100644
--- a/include/linux/semaphore.h
+++ b/include/linux/semaphore.h
@@ -41,6 +41,7 @@ extern int __must_check down_interruptible(struct semaphore *sem);
 extern int __must_check down_killable(struct semaphore *sem);
 extern int __must_check down_trylock(struct semaphore *sem);
 extern int __must_check down_timeout(struct semaphore *sem, long jiffies);
+extern void printk_safe_up(struct semaphore *sem);
 extern void up(struct semaphore *sem);
 
 #endif /* __LINUX_SEMAPHORE_H */
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 561acdd39960..55a896f18d91 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -197,6 +197,37 @@ struct semaphore_waiter {
 	bool up;
 };
 
+/**
+ * printk_safe_up - release the semaphore in console_unlock
+ * @sem: the semaphore to release
+ *
+ * Release the semaphore.  Unlike mutexes, up() may be called from any
+ * context and even by tasks which have never called down().
+ *
+ * NOTE: This is a special version of up() for console_unlock only. It is only
+ * safe if there are no killable, interruptible or timing out down() calls.
+ */
+void printk_safe_up(struct semaphore *sem)
+{
+	unsigned long flags;
+	struct semaphore_waiter *waiter = NULL;
+
+	raw_spin_lock_irqsave(&sem->lock, flags);
+	if (likely(list_empty(&sem->wait_list))) {
+		sem->count++;
+	} else {
+		waiter = list_first_entry(&sem->wait_list,
+					  struct semaphore_waiter, list);
+		list_del(&waiter->list);
+		waiter->up = true;
+	}
+	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	if (waiter) /* protected by being sole wake source */
+		wake_up_process(waiter->task);
+}
+EXPORT_SYMBOL(printk_safe_up);
+
 /*
  * Because this function is inlined, the 'state' parameter will be
  * constant, and thus optimised away by the compiler.  Likewise the
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02ca827b8fac..62303929afda 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -234,7 +234,7 @@ static void __up_console_sem(unsigned long ip)
 	mutex_release(&console_lock_dep_map, 1, ip);
 
 	printk_safe_enter_irqsave(flags);
-	up(&console_sem);
+	printk_safe_up(&console_sem);
 	printk_safe_exit_irqrestore(flags);
 }
 #define up_console_sem() __up_console_sem(_RET_IP_)
-- 
2.20.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ