Message-ID: <20101109044242.GH2715@dastard>
Date: Tue, 9 Nov 2010 15:42:42 +1100
From: Dave Chinner <david@...morbit.com>
To: Ted Ts'o <tytso@....edu>, Josef Bacik <josef@...hat.com>,
linux-kernel@...r.kernel.org, linux-btrfs@...r.kernel.org,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
xfs@....sgi.com, joel.becker@...cle.com, cmm@...ibm.com,
cluster-devel@...hat.com
Subject: Re: [PATCH 1/6] fs: add hole punching to fallocate
On Mon, Nov 08, 2010 at 10:30:38PM -0500, Ted Ts'o wrote:
> On Tue, Nov 09, 2010 at 12:12:22PM +1100, Dave Chinner wrote:
> > Hole punching was not included originally in fallocate() for a
> > variety of reasons. IIRC, they were along the lines of:
> >
> > 1 de-allocating of blocks in an allocation syscall is wrong.
> > People wanted a new syscall for this functionality.
....
> > I guess that leaves #1 to be debated;
> > I don't think there is any problem with doing what you propose.
>
> I don't have a problem either.
>
> As a completely separate proposal, what do people think about an
> FALLOCATE_FL_ZEROIZE after which time the blocks are allocated, but
> reading from them returns zero?
That's exactly what the new XFS_IOC_ZERO_RANGE ioctl in 2.6.36 does
(commit 447223520520b17d3b6d0631aa4838fbaf8eddb4 "xfs: Introduce
XFS_IOC_ZERO_RANGE"). The git commit I pointed to in the last email
is the rudimentary fallocate() interface support I have for that
code, which goes along with an xfs_io patch I have. Given that there
seems to be interest in this operation, I'll flesh it out into a
proper patch....
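
For readers who want to try the operation from user space, here is a
minimal sketch in C. It assumes a kernel and headers that provide
FALLOC_FL_ZERO_RANGE, the name under which a zero-range mode later
appeared in mainline fallocate() (Linux 3.15); the FALLOCATE_FL_ZEROIZE
name proposed in this thread never became an API. The helper name
zero_range() and the write-zeros fallback are illustrative choices, not
part of any patch discussed here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <unistd.h>

#ifndef FALLOC_FL_ZERO_RANGE
#define FALLOC_FL_ZERO_RANGE 0x10   /* value from <linux/falloc.h> */
#endif

/*
 * Hypothetical helper: make [off, off + len) of fd read back as zeros.
 * Tries the filesystem's zero-range support first; if the kernel or
 * filesystem does not support that mode, falls back to writing zeros.
 * Returns 0 on success, -1 with errno set on failure.
 */
int zero_range(int fd, off_t off, off_t len)
{
	char buf[4096];

	assert(off >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	/* Slow path: the data really is written to the file. */
	memset(buf, 0, sizeof(buf));
	while (len > 0) {
		size_t n = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
		ssize_t w = pwrite(fd, buf, n, off);

		if (w < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (w == 0) {
			errno = EIO;
			return -1;
		}
		off += w;
		len -= w;
	}
	return 0;
}
```

A caller would open a regular file read-write and call, for example,
zero_range(fd, 1024, 4096). Whether the fast path is taken depends on
the filesystem, so the fallback keeps the observable result the same
either way.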
> This could be done either by (a)
> sending a discard in the case of devices where discard_zeros_data is
> true and discard_granularity is less than the fs block size, or (b) by
> setting the uninitialized flag in the extent tree.
Implementation is up to the filesystem. However, XFS does (b)
because:

	1) it was extremely simple to implement (one of the
	   advantages of having an exceedingly complex allocation
	   interface to begin with :P);
	2) conversion is atomic, fast and reliable;
	3) it is independent of the underlying storage; and
	4) reads of unwritten extents operate at memory speed,
	   not disk speed.
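
The contract behind (b), blocks that are allocated yet read back as
zeros, can be observed from user space with ordinary preallocation. The
sketch below is only an illustration under assumptions:
prealloc_reads_zero() is a hypothetical helper, and posix_fallocate() is
used because it is portable. Whether the filesystem satisfies it with
unwritten extents or glibc emulates it by writing is not visible to the
caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate the first len bytes of fd, then
 * check that the whole range reads back as zeros.
 * Returns 1 if every byte is zero, 0 if a non-zero byte is found,
 * and -1 with errno set on error.
 */
int prealloc_reads_zero(int fd, off_t len)
{
	char buf[4096];
	off_t off = 0;
	int err;

	assert(len > 0);

	/* posix_fallocate() returns an error number, not -1. */
	err = posix_fallocate(fd, 0, len);
	if (err != 0) {
		errno = err;
		return -1;
	}

	while (off < len) {
		off_t left = len - off;
		size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
		ssize_t n = pread(fd, buf, want, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {		/* unexpected end of file */
			errno = EIO;
			return -1;
		}
		for (ssize_t i = 0; i < n; i++)
			if (buf[i] != 0)
				return 0;
		off += n;
	}
	return 1;
}
```

On a freshly created, empty file the range has never been written by
the application, so any data it returns comes from the filesystem's
handling of the preallocated blocks.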
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com