Message-ID: <20141015012534.GB12013@birch.djwong.org>
Date: Tue, 14 Oct 2014 18:25:34 -0700
From: "Darrick J. Wong" <darrick.wong@...cle.com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: Dave Chinner <david@...morbit.com>, Jens Axboe <axboe@...nel.dk>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
linux-fsdevel@...r.kernel.org,
linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: BLKZEROOUT + pread should return zeroes, right?
On Tue, Oct 14, 2014 at 02:32:10AM -0400, Theodore Ts'o wrote:
> The bottom line is for most of the use cases we are talking about,
> we're only zero'ing one or two 4k blocks at a time, so I've never been
> convinced that it's worth it to use BLKZEROOUT.
>
> We could add page cache coherency features to BLKZEROOUT, but I'm not
> entirely sure it's worth the effort. No user space program would be
> able to take advantage of adding coherency for several years, or
Well then, let's change BLKZEROOUT to require O_DIRECT instead of hiding the
coherency problem, and introduce BLKZEROOUT_INV, which issues the zeroout and
then takes care of page cache coherency.
(Or at least do the first part...)
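
A minimal sketch of how userspace might approximate the proposed behaviour
today, assuming a Linux block device fd. BLKZEROOUT_INV does not exist;
zeroout_and_invalidate() is a hypothetical helper, and the cache drop via
posix_fadvise() is advisory only, so this is not a real coherency guarantee:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKZEROOUT */

/*
 * Hypothetical helper (not an existing API): zero a byte range on a block
 * device, then ask the kernel to drop any cached pages covering it.
 * offset and len are assumed to be sector-aligned.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int zeroout_and_invalidate(int fd, uint64_t offset, uint64_t len)
{
	uint64_t range[2] = { offset, len };	/* { start, length } in bytes */
	int err;

	/* Flush dirty cached pages first so that later writeback cannot
	 * overwrite the zeroes we are about to put on the device. */
	if (fsync(fd) < 0)
		return -1;

	if (ioctl(fd, BLKZEROOUT, range) < 0)
		return -1;

	/* Advisory: clean, unmapped pages in the range should be dropped,
	 * so a later buffered pread goes back to the device. */
	err = posix_fadvise(fd, (off_t)offset, (off_t)len,
			    POSIX_FADV_DONTNEED);
	if (err != 0) {
		errno = err;	/* posix_fadvise returns the error number */
		return -1;
	}
	return 0;
}
```

On anything that is not a block device the ioctl fails and the helper returns
-1 without touching the cache.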
> adding feature tests, etc., and is it worth the upside of being able
> to use WRITE SAME for a few 4k or 8k writes? (Which the vast majority
> of storage devices don't support anyway....)
I've converted mke2fs and e2fsck to use BLKZEROOUT to zero the journal and the
inode tables when they want something to really be zero, and ext2fs_fallocate
uses it to zero the fallocated range. I suspect those three will zero long
runs of sectors on each call.
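
To illustrate the kind of call involved, here is a rough sketch of zeroing a
long run with BLKZEROOUT and falling back to plain writes when the ioctl is
unavailable. zero_range() is a hypothetical name and this is not the
e2fsprogs code; the 64 KiB fallback buffer is an arbitrary choice:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKZEROOUT */

/*
 * Hypothetical helper: zero [offset, offset + len) on fd.
 * Tries BLKZEROOUT first (offset and len assumed sector-aligned), then
 * falls back to writing a buffer of zeroes if the ioctl fails for any reason.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int zero_range(int fd, uint64_t offset, uint64_t len)
{
	static const char zeroes[65536];	/* zero-initialized */
	uint64_t range[2] = { offset, len };	/* { start, length } in bytes */

	if (ioctl(fd, BLKZEROOUT, range) == 0)
		return 0;

	/* Fallback path: ordinary writes of zeroes, one chunk at a time. */
	while (len > 0) {
		size_t chunk = len < sizeof(zeroes) ? (size_t)len
						    : sizeof(zeroes);
		ssize_t n = pwrite(fd, zeroes, chunk, (off_t)offset);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			errno = EIO;	/* no progress; give up */
			return -1;
		}
		offset += (uint64_t)n;
		len -= (uint64_t)n;
	}
	return 0;
}
```

Note that the two paths differ with respect to the page cache: the fallback
writes go through it (unless fd was opened with O_DIRECT), while the ioctl
does not.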
As for WRITE_SAME support, if it's there, why ignore it? The ioctl exists;
someone else is bound to use it sooner or later.
A further optimization to mke2fs would be to detect that we've run
discard-with-zeroes and can therefore skip issuing subsequent zeroouts on the
same ranges, but I'm wary of trusting that discard-zeroes-data does what it
purports to do. If it /does/ work reliably, though, ext2fs_zero_blocks() could
be rerouted to use discard instead. Really, my reason for wanting to use
zeroout is that, because it guarantees zero-read behavior afterwards, it ought
to be less problematic than discard has been.
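
A sketch of what that skip check might look like, assuming the
BLKDISCARDZEROES ioctl from <linux/fs.h> is used to ask the device.
Both function names are hypothetical, the answer is only the device's claim,
and later kernels may simply report 0 here, in which case the zeroout is
never skipped:

```c
#include <sys/ioctl.h>
#include <linux/fs.h>	/* BLKDISCARDZEROES */

/*
 * Hypothetical helper: returns 1 only if the device claims that discarded
 * blocks read back as zeroes. Any failure to ask is treated as "no".
 */
static int device_claims_discard_zeroes(int fd)
{
	unsigned int zeroes = 0;

	if (ioctl(fd, BLKDISCARDZEROES, &zeroes) < 0)
		return 0;
	return zeroes != 0;
}

/*
 * Hypothetical policy: a zeroout may be skipped only when the range was
 * already discarded AND the device claims discard zeroes data.
 */
static int can_skip_zeroout(int fd, int range_was_discarded)
{
	return range_was_discarded && device_claims_discard_zeroes(fd);
}
```

The conservative default matters here: if the claim cannot be confirmed, the
caller still issues the zeroout.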
--D
>
> Cheers,
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html