linux-kernel - Re: [RFC PATCH 3/5] printk/nmi: Try hard to print Oops message in NMI context

Message-ID: <20150911152258.GC11137@pathway.suse.cz>
Date:	Fri, 11 Sep 2015 17:22:58 +0200
From:	Petr Mladek <pmladek@...e.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>, jkosina@...e.cz,
	paulmck@...ux.vnet.ibm.com, Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 3/5] printk/nmi: Try hard to print Oops message in
 NMI context

On Tue 2015-07-14 10:20:11, Peter Zijlstra wrote:
> On Mon, Jul 13, 2015 at 04:14:19PM -0700, Andrew Morton wrote:
> > On Thu, 9 Jul 2015 18:21:54 +0200 Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> > > On Thu, Jul 09, 2015 at 03:36:20PM +0200, Petr Mladek wrote:
> > > > +	/*
> > > > +	 * Messages are passed from NMI context using an extra buffer.
> > > > +	 * The only exception is when Oops is in progress. In this case
> > > > +	 * we try hard to get them out directly.
> > > > +	 */
> > > > +	if (unlikely(oops_in_progress && in_nmi()))
> > > > +		zap_locks();
> > > 
> > > zap_locks() is broken and horrible and is another thing we should take
> > > out back.
> > > 
> > > Imagine a ticket lock that's held, you zap it, it gets released and now
> > > its tail is ahead of its head and you're forever stuck.
> > > 
> > > I've actually had that happen several times.
> > 
> > Is that really a problem?  The thinking is that the system is already
> > hosed.  The role of zap_locks is to increase the chances of getting the
> > why-it-died messages emitted.  After that, nothing matters?
> 
> I've had zap_locks eat my msgs.. doesn't happen often, but it did happen
> a few times and got me upset.

Hmm, zapping locks might cause problems if the old owner of the lock
is still active. But the alternative that Peter suggested, writing
messages directly to safe consoles, might open another can of worms
as well.

If the main concern with zap_locks() is the potential deadlock with
ticket locks, I wonder whether the following patch might make it more
acceptable. I hope that my thinking is correct.
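
To make the failure concrete, here is a rough userspace model of the
scenario. It is only a sketch: the model_* names are hypothetical, the
counters are plain non-atomic bytes, and nothing here is the kernel's
arch_spinlock_t. It replays "owner takes the lock, the lock is zapped,
the old owner unlocks" and reports whether the next caller would stall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a ticket lock, not the kernel type. */
#define MODEL_TICKET_INC 1

struct model_ticket_lock {
	uint8_t head;	/* ticket currently being served */
	uint8_t tail;	/* next ticket to hand out */
};

/* Reinitialize ("zap") the lock regardless of who holds it. */
static void model_zap(struct model_ticket_lock *lock)
{
	lock->head = 0;
	lock->tail = 0;
}

/* Take a ticket; true means the caller would own the lock immediately. */
static bool model_take_ticket(struct model_ticket_lock *lock)
{
	uint8_t my_ticket = lock->tail;

	lock->tail += MODEL_TICKET_INC;
	return lock->head == my_ticket;
}

/* Unlocking touches only the head. */
static void model_unlock(struct model_ticket_lock *lock)
{
	lock->head += MODEL_TICKET_INC;
}

/* Replay the scenario; true means the next caller is stuck. */
static bool model_zap_then_stale_unlock_stalls(void)
{
	struct model_ticket_lock lock = { 0, 0 };

	model_take_ticket(&lock);	/* old owner: head=0, tail=1 */
	model_zap(&lock);		/* zapped:    head=0, tail=0 */
	model_unlock(&lock);		/* stale unlock: head=1, tail=0 */
	assert(lock.head == 1 && lock.tail == 0);

	/* The new caller draws ticket 0, but head has already passed it. */
	return !model_take_ticket(&lock);
}
```

In this model the new caller would wait until the 8-bit head wraps
around to its ticket, which matches the "never, unless the ticket
number overflows" caveat.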


From 674f1a7603ad25d94f76c1d027cd74fee4317915 Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@...e.com>
Date: Fri, 11 Sep 2015 16:49:46 +0200
Subject: [PATCH 1/2] x86/spinlocks: Avoid a deadlock when someone unlocks a
 zapped ticket spinlock

There are a few situations in which we reinitialize (zap) ticket
spinlocks. This typically happens when the system is going down after
an error and we want to avoid a deadlock in some important services.
Examples are zap_locks() in printk.c and ioapic_zap_locks().

Peter pointed out that a partial deadlock is still possible. It happens
when someone owns a ticket spinlock, we reinitialize it, and the old
owner then releases it. The head is now ahead of the tail and the
following spin_lock() will never[*] succeed.

We could detect this situation in arch_spin_lock() and simply ignore
the superfluous head increment.

We need to do it on the lock() side because the unlock() side works
only with the head to avoid an overflow. Therefore we do not see
a consistent state of the head and the tail there.

Note that we cannot check for (head == TICKET_LOCK_INC && !tail)
because the reinitialized lock might be taken several times before
the old owner releases the lock. In other words, the superfluous
head increment might happen at any time.

The change looks quite harmless. It should not affect the fast path
where the lock is taken immediately. It does not worsen the
situation in which two processes might own the lock after zapping.
It just avoids the partial deadlock.

[*] unless the ticket number overflows.

Reported-by: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Petr Mladek <pmladek@...e.com>
---
 arch/x86/include/asm/spinlock.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index be0a05913b91..f732abf57c6f 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -105,12 +105,21 @@ static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
  */
 static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 {
-	register struct __raw_tickets inc = { .tail = TICKET_LOCK_INC };
+	register struct __raw_tickets inc;
 
+again:
+	inc = (struct __raw_tickets){ .tail = TICKET_LOCK_INC };
 	inc = xadd(&lock->tickets, inc);
 	if (likely(inc.head == inc.tail))
 		goto out;
 
+	/*
+	 * Avoid a stall when an old owner unlocked a reinitialized spinlock.
+	 * Simply ignore the superfluous increment of the head.
+	 */
+	if (unlikely(inc.head == inc.tail + TICKET_LOCK_INC))
+		goto again;
+
 	for (;;) {
 		unsigned count = SPIN_THRESHOLD;
 
-- 
1.8.5.6
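
For anyone who wants to step through the retry outside the kernel,
below is a minimal single-threaded sketch of the same check. The
sketch_* names are hypothetical, sketch_xadd_tail() is a non-atomic
stand-in for xadd(), and the (uint8_t) cast is this sketch's own choice
to keep the comparison in 8-bit arithmetic. It models only the first
lock attempt and returns false where the real code would start
spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the kernel definitions. */
#define SKETCH_TICKET_INC 1

struct sketch_tickets {
	uint8_t head;	/* ticket currently being served */
	uint8_t tail;	/* next ticket to hand out */
};

/* Non-atomic stand-in for xadd(): bump the tail, return the old pair. */
static struct sketch_tickets sketch_xadd_tail(struct sketch_tickets *lock)
{
	struct sketch_tickets old = *lock;

	lock->tail += SKETCH_TICKET_INC;
	return old;
}

/* One lock attempt; true means the lock was acquired without spinning. */
static bool sketch_try_lock_with_retry(struct sketch_tickets *lock)
{
	struct sketch_tickets inc;

again:
	inc = sketch_xadd_tail(lock);
	if (inc.head == inc.tail)
		return true;

	/*
	 * The head is exactly one increment ahead of our ticket: treat it
	 * as a stale unlock of a reinitialized lock and draw a new ticket.
	 */
	if (inc.head == (uint8_t)(inc.tail + SKETCH_TICKET_INC))
		goto again;

	return false;	/* the real lock would spin here */
}

/* Start from the state left by a stale unlock: head=1, tail=0. */
static bool sketch_recovers_from_stale_unlock(void)
{
	struct sketch_tickets lock = { .head = SKETCH_TICKET_INC, .tail = 0 };
	bool acquired = sketch_try_lock_with_retry(&lock);

	/* Two tickets were drawn: the skipped one and the one we hold. */
	assert(lock.tail == 2 * SKETCH_TICKET_INC);
	return acquired;
}
```

Starting from head=1, tail=0, the first ticket (0) is skipped and the
second ticket (1) matches the head, so the caller gets the lock and a
later unlock leaves head == tail again.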
