[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100224145642.GJ3687@quack.suse.cz>
Date: Wed, 24 Feb 2010 15:56:43 +0100
From: Jan Kara <jack@...e.cz>
To: Dave Chinner <david@...morbit.com>
Cc: tytso@....edu, Jan Kara <jack@...e.cz>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jens Axboe <jens.axboe@...cle.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
jengelh@...ozas.de, stable@...nel.org, gregkh@...e.de
Subject: Re: [PATCH] writeback: Fix broken sync writeback
On Tue 23-02-10 16:53:35, Dave Chinner wrote:
> > > > This is done to avoid a lock inversion, and so this is an
> > > > ext4-specific thing (at least I don't think XFS's delayed allocation
> > > > has this misfeature).
> > >
> > > Not that I know of, but then again I don't know what inversion ext4
> > > is trying to avoid. Can you describe the inversion, Ted?
> >
> > The locking order is journal_start_handle (starting a micro
> > transaction in jbd) -> lock_page. A more detailed description of why
> > this locking order is non-trivial for us to fix in ext4 can be found
> > in the description of commit f0e6c985.
>
> Nasty - you need to start a transaction before you lock pages for
> writeback and allocation, but ->writepage hands you a locked page.
> And you can't use an existing transaction handle open because you
> can't guarantee that you have journal credits reserved for the
> allocation?
Exactly.
> IIUC, ext3/4 has this problem due to the ordered data writeback
> constraints, right?
Not quite. I don't know how XFS solves this but in ext3/4 starting
a transaction can block (waiting for journal space) until all users of
a previous transaction are done with it and we can commit it. Thus
the transaction start / stop behave just as an ordinary lock. Because
you need a transaction started when writing a page (for metadata updates)
there is some lock ordering forced between a page lock and a trasaction
start / stop. Ext4 chose it to be transaction -> page lock (which makes
writepages more efficient and writepage hard), ext3 has page lock ->
transaction (so it has working ->writepage).
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists