Message-ID: <20090325235041.GA11024@duck.suse.cz>
Date: Thu, 26 Mar 2009 00:50:41 +0100
From: Jan Kara <jack@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Theodore Tso <tytso@....edu>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Arjan van de Ven <arjan@...radead.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nick Piggin <npiggin@...e.de>,
Jens Axboe <jens.axboe@...cle.com>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29
On Wed 25-03-09 16:21:56, Linus Torvalds wrote:
> On Wed, 25 Mar 2009, Theodore Tso wrote:
> >
> > Um, no, ext3 shouldn't block on writepage(). Since it doesn't do
> > delayed allocation, it should always be able to push out a dirty page
> > to the disk.
>
> Umm. Maybe I'm mis-reading something, but they seem to all synchronize
> with the journal with "ext3_journal_start/stop".
>
> Which will at a minimum wait for 'j_barrier_count == 0' and 't_state !=
> T_LOCKED'. Along with making sure that there are enough transaction
> buffers.
>
> Do I understand _why_ ext3 does that? Hell no. The code makes no sense to
> me. But I don't think I'm wrong.
>
> Look at the sane case (data=ordered): it still does
>
> handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
> ...
> err = ext3_journal_stop(handle);
>
> around all the IO starting. Never mind that the IO shouldn't be needing
> any journal activity at all afaik in any common case.
>
> Yes, yes, it may need to allocate backing store (a page that was dirtied
> by mmap), and I'm sure that's the reason for it all, but the point is,
> most of the time there should be no journal activity at all, yet it looks
> very much like a simple writepage() will synchronize with a full journal
> and wait for the journal to get space.
>
> No?
Yes, you got it right. Furthermore, in ordered mode we need to attach
buffers to the running transaction if they aren't there yet (but to check
whether they are, we need to pin the running transaction, and then we are
basically back where we started.. damn). But maybe there's a way out of it.
We don't have to guarantee that data written via mmap is on disk when "the
transaction running when somebody decided to call writepage" commits (as
long as no block allocation happens), so we could just submit those buffers
for IO and not attach them to the transaction...
> So tell me again how the VM can rely on the filesystem not blocking at
> random points.
I can write a patch to make writepage() non-blocking on the journal in the
non-"mmapped creation" case. I'll also have to find out whether it
actually helps anything, but it's probably worth trying...
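To make the idea concrete, here is a rough userspace sketch in C of the
decision I have in mind. It is only a toy model: every toy_* name is made up
for illustration, the structures are not the kernel's, and "submitting IO" is
reduced to clearing a dirty flag. The only point is the split between a fast
path that never touches the journal and a slow path that would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page and its buffers; NOT kernel structures. */
struct toy_buffer {
	bool mapped;	/* buffer already has a disk block assigned */
	bool dirty;	/* buffer holds data not yet submitted for IO */
};

struct toy_page {
	struct toy_buffer *bufs;
	size_t nr_bufs;
};

enum toy_path { TOY_PATH_DIRECT, TOY_PATH_JOURNALED };

/* True if no buffer needs block allocation (the non-"mmapped creation" case). */
static bool toy_page_fully_mapped(const struct toy_page *page)
{
	for (size_t i = 0; i < page->nr_bufs; i++)
		if (!page->bufs[i].mapped)
			return false;
	return true;
}

/* Stand-in for submitting dirty buffers for IO; returns how many were sent. */
static size_t toy_submit_buffers(struct toy_page *page)
{
	size_t n = 0;

	for (size_t i = 0; i < page->nr_bufs; i++) {
		if (page->bufs[i].dirty) {
			page->bufs[i].dirty = false;
			n++;
		}
	}
	return n;
}

static enum toy_path toy_writepage(struct toy_page *page, size_t *submitted)
{
	assert(page != NULL && submitted != NULL);

	if (toy_page_fully_mapped(page)) {
		/* Fast path: no handle is started, so nothing here can wait
		 * on the journal. Buffers are not attached to a transaction. */
		*submitted = toy_submit_buffers(page);
		return TOY_PATH_DIRECT;
	}

	/* Slow path: the real code would start a journal handle here (and
	 * possibly block), allocate the missing blocks, then stop the handle.
	 * The toy version just marks the buffers as mapped. */
	for (size_t i = 0; i < page->nr_bufs; i++)
		page->bufs[i].mapped = true;
	*submitted = toy_submit_buffers(page);
	return TOY_PATH_JOURNALED;
}
```

A page whose buffers are all mapped takes TOY_PATH_DIRECT; a page with even
one unmapped buffer takes TOY_PATH_JOURNALED, which is the only place the
real code would still need ext3_journal_start/stop.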
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR