Message-ID: <20090526204703.GM11363@kernel.dk>
Date: Tue, 26 May 2009 22:47:04 +0200
From: Jens Axboe <jens.axboe@...cle.com>
To: Damien Wyart <damien.wyart@...e.fr>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
chris.mason@...cle.com, david@...morbit.com, hch@...radead.org,
akpm@...ux-foundation.org, jack@...e.cz,
yanmin_zhang@...ux.intel.com, richard@....demon.co.uk
Subject: Re: [PATCH 0/12] Per-bdi writeback flusher threads v7
On Tue, May 26 2009, Damien Wyart wrote:
> > > I have been playing with v7 since you sent it, and after a while
> > > (short on laptop, longer on desktop, a few hours), writeback doesn't
> > > seem to work anymore. A manual call to sync hangs (process in D state)
> > > and the Dirty value in meminfo keeps growing. As previous versions had
> > > been heavily tested, I guess there is some regression in v7.
>
> > Not good, the prime suspect is the sync notification stuff. I'll take
> > a look and get that fixed. You didn't happen to catch any sysrq-t back
> > traces or anything like that? Would be interesting to see where
> > bdi-default and the bdi-* threads are stuck.
>
> No, as I was doing many things at the same time and not exclusively
> debugging, I just rebooted hard and went back to an unpatched kernel when
> the problems occurred. But I noticed only bdi-default was alive, the
> other bdi-* threads had disappeared and the sync commands I had tried
> were all in D state. Also I tried to reinstall a kernel .deb (these
> systems are Debian) and this got stuck during installation, when probing
> grub config (do not know if there is some sync syscall in there).
>
> Can try to go further tomorrow but will not have a lot of time...

OK, I spotted the problem. If we fall back to the on-stack allocation in
bdi_writeback_all(), then we wait for the work completion with the
bdi_lock mutex held. This can deadlock with bdi_forker_task(): if we
need that to be invoked to make progress (which happens when a thread
needs to be restarted), then we deadlock on that mutex.

I'll cook up a fix for this, but probably not before the morning.
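
The lock ordering described above can be illustrated with a small userspace sketch. This is not the kernel code and not the eventual fix: it is Python rather than kernel C, and simulate_flush(), forker(), need_forker and work_done are hypothetical stand-ins chosen for illustration. Only the names bdi_lock and bdi_forker_task() come from the message. A timeout replaces the real hang so the sketch terminates.

```python
import threading


def simulate_flush(wait_with_lock_held, timeout=0.5):
    """Return True if the work completes, False if the waiter times out.

    Hypothetical model: a waiter queues work and waits for completion,
    while a forker thread must take the same lock before the work can
    complete.
    """
    bdi_lock = threading.Lock()      # stands in for the bdi_lock mutex
    work_done = threading.Event()    # stands in for the work completion
    need_forker = threading.Event()  # "a thread needs to be restarted"

    def forker():
        # Stand-in for bdi_forker_task(): it needs bdi_lock to make progress.
        need_forker.wait()
        with bdi_lock:
            work_done.set()

    t = threading.Thread(target=forker, daemon=True)
    t.start()

    if wait_with_lock_held:
        # Problem case: wait for completion while still holding the lock.
        # The forker blocks on the lock, so the wait can only time out.
        with bdi_lock:
            need_forker.set()
            completed = work_done.wait(timeout)
    else:
        # One possible ordering that avoids it: drop the lock, then wait.
        with bdi_lock:
            need_forker.set()
        completed = work_done.wait(timeout)

    t.join(timeout)
    return completed


if __name__ == "__main__":
    print("wait with lock held:", simulate_flush(True))
    print("wait after unlock:  ", simulate_flush(False))
```

In the first case the wait times out because the forker never gets the lock; in the kernel there is no timeout, so the equivalent wait would simply hang, which would match the sync processes stuck in D state.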
--
Jens Axboe