Message-ID: <alpine.LFD.2.00.0903251826390.3032@localhost.localdomain>
Date: Wed, 25 Mar 2009 18:34:32 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Jan Kara <jack@...e.cz>
cc: Theodore Tso <tytso@....edu>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Arjan van de Ven <arjan@...radead.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nick Piggin <npiggin@...e.de>,
Jens Axboe <jens.axboe@...cle.com>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29
On Thu, 26 Mar 2009, Jan Kara wrote:
>
> 1) We have to writeout blocks full of zeros on allocation so that we don't
> expose unallocated data => slight slowdown
Why?
This is in _no_ way different from a regular "write()" system call. And
there, we just attach the buffers to the page. If something crashes before
the page actually gets written out, then hopefully we'll never have
written out the metadata either (that's what "data=ordered" means).
> 2) When blocksize < pagesize we must play nasty tricks for this to work
> (think about i_size = 1024, set_page_dirty(), truncate(f, 8192),
> writepage() -> uhuh, not enough space allocated)
Good point. I suspect not enough people have played around with
"set_page_dirty()" to find these kinds of things. The VFS layer probably
doesn't help sufficiently with the half-dirty pages, although the FS can
obviously always look up the previously last page and do things manually
if it wants to.
But yes, this is nasty.
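To make the arithmetic in point 2 concrete, here is a minimal user-space sketch, not kernel code. It assumes 4096-byte pages and 1024-byte blocks (the sizes are an assumption; the mail only says blocksize < pagesize), and the helper names are hypothetical. It counts how many blocks of a page lie within i_size when the page is dirtied versus when writepage() runs:

```c
/* Illustrative sizes only: 4 KiB pages, 1 KiB filesystem blocks. */
enum { PAGE_SIZE_BYTES = 4096, BLOCK_SIZE_BYTES = 1024 };

/*
 * Number of blocks of page `page_index` that hold data at or below
 * i_size. Blocks lying wholly beyond end-of-file need no backing store.
 */
static unsigned blocks_needed(unsigned long long i_size,
                              unsigned long page_index)
{
    unsigned long long page_start =
        (unsigned long long)page_index * PAGE_SIZE_BYTES;
    unsigned long long bytes;

    if (i_size <= page_start)
        return 0;

    bytes = i_size - page_start;
    if (bytes > PAGE_SIZE_BYTES)
        bytes = PAGE_SIZE_BYTES;

    /* Round up to whole blocks. */
    return (unsigned)((bytes + BLOCK_SIZE_BYTES - 1) / BLOCK_SIZE_BYTES);
}

/*
 * Blocks still unallocated when writepage() runs, if allocation was
 * done once at dirty time and the file was extended in between.
 */
static unsigned blocks_missing_at_writepage(unsigned long long i_size_at_dirty,
                                            unsigned long long i_size_at_writepage,
                                            unsigned long page_index)
{
    unsigned had = blocks_needed(i_size_at_dirty, page_index);
    unsigned need = blocks_needed(i_size_at_writepage, page_index);

    return need > had ? need - had : 0;
}
```

With i_size = 1024 at dirty time and 8192 after the truncate, page 0 had one block allocated but needs four at writepage() time, so three blocks are missing under these assumed sizes.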
> 3) We'll do allocation in the order in which pages are dirtied. Generally,
> I'd suspect this order to be less linear than the order in which writepages
> submit IO and thus it will result in the larger fragmentation of the file.
> So it's not a clear win IMHO.
Yes, that may be the case.
Of course, the approach of just checking whether the buffer heads already
exist and are mapped (before bothering with anything else) probably works
fine in practice. In most loads, pages will have been dirtied by regular
"write()" system calls, and then we will have the buffers pre-allocated
regardless.
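The "check first" approach can be sketched roughly as follows. This is a user-space model under stated assumptions, not the kernel's API: struct model_buffer and struct model_page are hypothetical stand-ins for buffer heads and pages, and the model uses a NULL-terminated list for simplicity where the kernel links a page's buffers in a ring.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head: one per block in the page. */
struct model_buffer {
    bool mapped;                 /* block already has an on-disk mapping */
    struct model_buffer *next;   /* next buffer in the same page, or NULL */
};

/* Hypothetical stand-in for a page; buffers is NULL if none are attached. */
struct model_page {
    struct model_buffer *buffers;
};

/*
 * Fast-path test: return true only if the page has buffers attached and
 * every one of them is mapped. In that case nothing needs allocating
 * when the page is dirtied, and the slow path can be skipped.
 */
static bool page_fully_mapped(const struct model_page *page)
{
    const struct model_buffer *b;

    if (page->buffers == NULL)
        return false;

    for (b = page->buffers; b != NULL; b = b->next) {
        if (!b->mapped)
            return false;
    }
    return true;
}
```

A page previously filled by write() would pass this test in the model, so only pages with missing or unmapped buffers would take the allocation path.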
Linus