Message-ID: <20090331100150.GF11808@duck.suse.cz>
Date: Tue, 31 Mar 2009 12:01:51 +0200
From: Jan Kara <jack@...e.cz>
To: Alexander Beregalov <a.beregalov@...il.com>
Cc: Theodore Tso <tytso@....edu>,
"linux-next@...r.kernel.org" <linux-next@...r.kernel.org>,
linux-ext4@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
sparclinux@...r.kernel.org
Subject: Re: next-20090310: ext4 hangs
On Thu 26-03-09 01:38:32, Alexander Beregalov wrote:
> 2009/3/25 Jan Kara <jack@...e.cz>:
> > On Wed 25-03-09 20:07:46, Alexander Beregalov wrote:
> >> 2009/3/25 Jan Kara <jack@...e.cz>:
> >> > On Wed 25-03-09 18:29:10, Alexander Beregalov wrote:
> >> >> 2009/3/25 Jan Kara <jack@...e.cz>:
> >> >> > On Wed 25-03-09 18:18:43, Alexander Beregalov wrote:
> >> >> >> 2009/3/25 Jan Kara <jack@...e.cz>:
> >> >> >> >> > So, I think I need to try it on 2.6.29-rc7 again.
> >> >> >> >> I've looked into this. Obviously, what's happening is that we delete
> >> >> >> >> an inode and jbd2_journal_release_jbd_inode() finds the inode is just under
> >> >> >> >> writeout in transaction commit and thus it waits. But it never gets woken
> >> >> >> >> up and because it holds a handle from the transaction, everyone eventually
> >> >> >> >> blocks waiting for the transaction to finish.
> >> >> >> >> But I don't really see how that can happen. The code is really
> >> >> >> >> straightforward and everything happens under j_list_lock... Strange.
> >> >> >> > BTW: Is the system SMP?
> >> >> >> No, it is UP system.
> >> >> > Even stranger. And do you have CONFIG_PREEMPT set?
> >> >> >
> >> >> >> The bug exists even in 2.6.29, I posted it with a new topic.
> >> >> > OK, I've sort-of expected this.
> >> >>
> >> >> CONFIG_PREEMPT_RCU=y
> >> >> CONFIG_PREEMPT_RCU_TRACE=y
> >> >> # CONFIG_PREEMPT_NONE is not set
> >> >> # CONFIG_PREEMPT_VOLUNTARY is not set
> >> >> CONFIG_PREEMPT=y
> >> >> CONFIG_DEBUG_PREEMPT=y
> >> >> # CONFIG_PREEMPT_TRACER is not set
> >> >>
> >> >> config is attached.
> >> > Thanks for the data. I still don't see how the wakeup can get lost. The
> >> > process cannot even be preempted when we are in the section protected by
> >> > j_list_lock... Can you send me a disassembly of the functions
> >> > jbd2_journal_release_jbd_inode() and journal_submit_data_buffers() so that
> >> > I can see whether the compiler has reordered something unexpectedly?
> > Thanks for the disassembly...
> >
> >> By default gcc inlines journal_submit_data_buffers()
> >> Here is -fno-inline version. Default version is in attach.
<snip>
I'm at a loss here. I don't see how we can miss a wakeup (plus you seem to
be the only one reporting the bug). Could you please compile and test the kernel
with the attached patch? It prints to the kernel log when we go to sleep
waiting for inode commit, when we send wakeups, etc. When you hit the
deadlock, please send me your kernel log. It should help with debugging why
we miss the wakeup. Thanks.
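For reference, the wait/wakeup pattern under discussion can be sketched in
userspace roughly as follows. This is only an analogy built on pthreads, not
the jbd2 code: struct demo_inode, committer(), release_inode() and run_demo()
are hypothetical names, the mutex stands in for j_list_lock, and the
fprintf() calls stand in for the kind of log messages the debugging patch is
meant to produce. As long as the flag is tested and cleared under the same
lock, the waiter should not be able to miss the wakeup, which is why the
reported hang is so puzzling.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in; not the real jbd2 data structures. */
struct demo_inode {
	pthread_mutex_t lock;      /* plays the role of j_list_lock */
	pthread_cond_t  wq;        /* plays the role of the wait queue */
	bool            in_commit; /* "inode is under writeout in commit" */
	int             sleeps;    /* how often the waiter went to sleep */
	int             wakeups;   /* how often a wakeup was sent */
};

/* Commit side: clear the flag and send the wakeup, both under the lock. */
static void *committer(void *arg)
{
	struct demo_inode *di = arg;

	pthread_mutex_lock(&di->lock);
	di->in_commit = false;
	di->wakeups++;
	fprintf(stderr, "commit: cleared in_commit, sending wakeup\n");
	pthread_cond_broadcast(&di->wq);
	pthread_mutex_unlock(&di->lock);
	return NULL;
}

/* Release side: sleep until the commit no longer uses the inode. */
static void release_inode(struct demo_inode *di)
{
	pthread_mutex_lock(&di->lock);
	/* Re-check the flag after every wakeup; the wait drops the lock. */
	while (di->in_commit) {
		di->sleeps++;
		fprintf(stderr, "release: inode in commit, going to sleep\n");
		pthread_cond_wait(&di->wq, &di->lock);
	}
	fprintf(stderr, "release: inode free, continuing\n");
	pthread_mutex_unlock(&di->lock);
}

/* Returns 0 if the waiter finished and exactly one wakeup was sent. */
int run_demo(void)
{
	struct demo_inode di = { .in_commit = true };
	pthread_t t;

	pthread_mutex_init(&di.lock, NULL);
	pthread_cond_init(&di.wq, NULL);

	if (pthread_create(&t, NULL, committer, &di) != 0)
		return -1;
	release_inode(&di);
	pthread_join(t, NULL);

	pthread_cond_destroy(&di.wq);
	pthread_mutex_destroy(&di.lock);
	return (!di.in_commit && di.wakeups == 1) ? 0 : -1;
}
```

Calling run_demo() from a small main() and building with -pthread is enough
to try it. Depending on scheduling, the waiter may or may not actually sleep
before the flag is cleared, but it terminates either way.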
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR