Message-ID: <20121210182053.GC7516@thunk.org>
Date: Mon, 10 Dec 2012 13:20:53 -0500
From: Theodore Ts'o <tytso@....edu>
To: Steven Whitehouse <swhiteho@...hat.com>,
Dave Chinner <david@...morbit.com>,
Chris Mason <chris.mason@...ionio.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ric Wheeler <rwheeler@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
Christoph Hellwig <hch@...radead.org>,
Martin Steigerwald <Martin@...htvoll.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate
UAPI
A sentence or two got chopped out during an editing pass. Let me try
that again so it's a bit clearer what I was trying to say....
Sure, but if the block device supports WRITE_SAME or persistent
discard, then presumably fallocate() should do this automatically all
the time, and not require a flag to request this behavior. The only
reason why you might not is if the WRITE_SAME is more costly; that is,
if a seek plus writing 1MB takes a noticeably larger fraction of disk
time than a seek plus writing 4k or 32k.
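
To put rough numbers on that comparison, here is a back-of-the-envelope
sketch. The 10 ms seek time and 100 MB/s transfer rate are assumed
figures for a generic rotating disk, not measurements, and io_time_ms
is a hypothetical helper written only for this illustration.

```python
def io_time_ms(nbytes, seek_ms=10.0, mb_per_s=100.0):
    """Approximate disk time for one seek plus a sequential write.

    seek_ms and mb_per_s are assumed, illustrative device parameters.
    """
    transfer_ms = nbytes / (mb_per_s * 1024 * 1024) * 1000.0
    return seek_ms + transfer_ms


# Compare the three write sizes mentioned in the text.
for size in (4 * 1024, 32 * 1024, 1024 * 1024):
    print(f"{size // 1024:5d}k: {io_time_ms(size):6.2f} ms")
```

With these assumed numbers the seek dominates the 4k and 32k cases,
while the 1MB write roughly doubles the disk time. On a device with
different seek and transfer characteristics the balance shifts, which
is the point of the comparison.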
Ext4 currently uses a threshold of 32k for this break point (below
that, we will use sb_issue_zeroout; above that, we will break apart an
uninitialized extent when writing into a preallocated region). It may
be that 32k is too low, especially for certain types of devices (e.g.,
SSDs versus RAID 5, where it should be aligned on a RAID stripe,
etc.). More of an issue might be that there will be some disagreement
about whether people want the system to automatically tune for
average throughput vs 99.9th percentile latency.
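
The break point described above amounts to a simple size comparison.
The following is only a sketch of that decision, not the ext4 code:
choose_strategy and its return strings are hypothetical names, the
quantity compared against the threshold is assumed to be the size of
the region that would need zeroing, and treating a size exactly equal
to the threshold as "zero out" is also an assumption.

```python
ZEROOUT_THRESHOLD_BYTES = 32 * 1024  # the 32k break point from the text


def choose_strategy(region_bytes, threshold=ZEROOUT_THRESHOLD_BYTES):
    """Pick how to handle a write into a preallocated region.

    Returns "zeroout" when the region is small enough that writing
    zeroes is assumed cheaper, otherwise "split" to break apart the
    uninitialized extent. The equality case is an assumption.
    """
    if region_bytes <= threshold:
        return "zeroout"
    return "split"


print(choose_strategy(4 * 1024))      # small region
print(choose_strategy(1024 * 1024))   # large region
```

Making the threshold a parameter is what would let a per-device or
auto-tuned value replace the fixed 32k.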
Regardless, this is actually something which I think the file system
should try to do automatically if at all possible, via some kind of
auto-tuning heuristic, instead of using an explicit fallocate(2) flag.
(See, I don't propose using a new fallocate flag for everything. :-)
- Ted