Message-ID: <1342586692.7321.45.camel@marge.simpson.net>
Date:	Wed, 18 Jul 2012 06:44:52 +0200
From:	Mike Galbraith <efault@....de>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Jan Kara <jack@...e.cz>, Jeff Moyer <jmoyer@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
	Jens Axboe <jaxboe@...ionio.com>, mgalbraith@...e.com,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: Deadlocks due to per-process plugging

(adds rather important missing Cc)

On Tue, 2012-07-17 at 15:10 +0200, Mike Galbraith wrote: 
> On Mon, 2012-07-16 at 12:19 +0200, Thomas Gleixner wrote:
> 
> > > @@ -647,8 +648,11 @@ static inline void rt_spin_lock_fastlock
> > >  
> > >  	if (likely(rt_mutex_cmpxchg(lock, NULL, current)))
> > >  		rt_mutex_deadlock_account_lock(lock, current);
> > > -	else
> > > +	else {
> > > +		if (blk_needs_flush_plug(current))
> > > +			blk_schedule_flush_plug(current);
> > >  		slowfn(lock);
> > > +	}
> > 
> > That should do the trick.
> 
> Box has been grinding away long enough now to agree that it did.
> 
> rt: pull your plug before blocking

Hm.  x3550 seems to have lost interest in nearly instant gratification
ext4 deadlock testcase: taskset -c 3 dbench -t 30 -s 8 in enterprise.
Previously, it _might_ have survived one 30 second test, but never for
minutes, much less several minutes of very many threads, so it appears
to have been another flavor of IO dependency deadlock. 

I just tried virgin 3.4.4-rt13, and it too happily churned away.. until
I tried dbench -t 300 -s 500 that is.  That (seemingly 100% repeatably)
produces an RCU stall that doesn't reach the serial console, nor will my
virgin source/config setup produce a crash dump.  Damn.  Enterprise kernel
will dump, but won't stall, so I guess I'd better check out the other
virgin 3.x-rt trees to at least narrow down where the stall started.

Whatever, RCU stall is a different problem.  Revert unplug patchlet, and
ext4 deadlock is back in virgin 3.4-rt, so methinks it's sufficiently
verified that either we need some form of unplug before blocking, or we
need a pull-your-plug point in at least two filesystems, maybe more.

-Mike

The patch in question for missing Cc.  Maybe should be only mutex, but I
see no reason why IO dependency can only possibly exist for mutexes...

rt: pull your plug before blocking

Queued IO can lead to IO deadlock should a task require wakeup from a task
which is blocked on that queued IO.

ext3: dbench1 queues a buffer, blocks on journal mutex, its plug is not
pulled.  dbench2 mutex owner is waiting for kjournald, who is waiting for
the buffer queued by dbench1.  Game over.

Signed-off-by: Mike Galbraith <efault@....de>

diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 3bff726..3f6ae32 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -20,6 +20,7 @@
 #include <linux/export.h>
 #include <linux/sched.h>
 #include <linux/timer.h>
+#include <linux/blkdev.h>
 
 #include "rtmutex_common.h"
 
@@ -647,8 +648,11 @@ static inline void rt_spin_lock_fastlock(struct rt_mutex *lock,
 
 	if (likely(rt_mutex_cmpxchg(lock, NULL, current)))
 		rt_mutex_deadlock_account_lock(lock, current);
-	else
+	else {
+		if (blk_needs_flush_plug(current))
+			blk_schedule_flush_plug(current);
 		slowfn(lock);
+	}
 }
 
 static inline void rt_spin_lock_fastunlock(struct rt_mutex *lock,
@@ -1104,8 +1108,11 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
-	} else
+	} else {
+		if (blk_needs_flush_plug(current))
+			blk_schedule_flush_plug(current);
 		return slowfn(lock, state, NULL, detect_deadlock);
+	}
 }
 
 static inline int
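
For readers following along without an -rt tree, the ext3 scenario from the
changelog can be modelled in plain userspace C.  This is only a sketch under
stated assumptions: every model_* name below is hypothetical and none of it
is kernel API.  A "task" is reduced to a count of requests still sitting in
its private plug, and a "lock" to an owner pointer.  The contended path
mirrors the shape of the patched fastlock helpers: check for plugged IO,
flush it, then block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT kernel structures. */
struct model_task {
	int plugged_io;		/* requests held back in the task's plug */
	int submitted_io;	/* requests already handed to the device */
};

struct model_lock {
	struct model_task *owner;	/* NULL when the lock is free */
};

/* Assumed analogue of "does this task have plugged IO to flush?" */
static bool model_needs_flush_plug(const struct model_task *t)
{
	return t->plugged_io > 0;
}

/* Assumed analogue of flushing the plug: push everything to the device. */
static void model_flush_plug(struct model_task *t)
{
	t->submitted_io += t->plugged_io;
	t->plugged_io = 0;
}

/*
 * Returns true if the lock was taken on the uncontended fast path.
 * Returns false if the task would have to block; with flush_before_block
 * set, the plug is pulled first, as in the patch above.
 */
static bool model_lock_acquire(struct model_lock *l, struct model_task *t,
			       bool flush_before_block)
{
	if (l->owner == NULL) {
		l->owner = t;
		return true;
	}
	if (flush_before_block && model_needs_flush_plug(t))
		model_flush_plug(t);
	return false;	/* a real task would sleep here */
}

/*
 * A blocked task that still holds plugged IO can stall whoever waits on
 * that IO (kjournald in the changelog), and through it the lock owner.
 */
static bool model_io_stuck(const struct model_task *blocked)
{
	return blocked->plugged_io > 0;
}
```

With a second task owning the lock, calling model_lock_acquire() for a task
holding one plugged request leaves that request stuck when
flush_before_block is false, and submitted when it is true.  The model says
nothing about scheduling or real IO completion; it only shows why the flush
has to happen before the task goes to sleep.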


