Message-ID: <4CC1B094.3090403@gmail.com>
Date: Fri, 22 Oct 2010 11:41:08 -0400
From: Ric Wheeler <ricwheeler@...il.com>
To: Eric Sandeen <sandeen@...hat.com>
CC: Lukas Czerner <lczerner@...hat.com>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, tytso@....edu
Subject: Re: [PATCH] e2fsck: Discard free data and inode blocks.
On 10/22/2010 11:37 AM, Eric Sandeen wrote:
> Ric Wheeler wrote:
>
> ...
>
>>> Well, so far the only breakages I have seen were with lots of small TRIMs
>>> (or UNMAPs, etc.) issued in a random pattern, never in the case of mkfs,
>>> which is quite the opposite - big sequential ranges.
>>>
>>> Hangs should be covered by those two patches:
>>>
>>> http://marc.info/?l=linux-ext4&m=128774558623608&w=2
>>> http://marc.info/?l=linux-ext4&m=128767099123375&w=2
>>>
>>> if, of course, they get upstream. Also there is a big win, when discard
>>> also zeroes data, because in that case we can just skip inode table
>>> initialization (zeroing) without any need of in-kernel lazyinit code
>>> enabled. And we get all this for free. It was introduced with Sandeen's
>>> patch:
>>>
>>> http://marc.info/?l=linux-ext4&m=128234048208327&w=2
>>>
>>> So, I would rather leave it on by default.
>>>
>>> -Lukas
>> You cannot 100% depend on discard zeroing blocks - that is not a
>> universal requirement of devices that support it. Specifically, for ATA
>> devices, I think that there are optional bits that specify how a device
>> will behave when you read from a trimmed region.
> But don't we have the ability to test whether discard -does- zero blocks,
> as advertised by the device? And honestly if the device mis-reports, that
> sounds like a device vendor problem to fix.
>
> The proposal wasn't to discard and assume zero, but to check for that
> behavior:
>
> http://kerneltrap.org/mailarchive/linux-ext4/2010/9/21/6885628/thread
>
> +    if (!retval && mke2fs_discard_zeroes_data(fs)) {
> +        if (verbose)
> +            printf(_("Discard succeeded and will return 0s "
> +                     " - enabling lazy_itable_init\n"));
> +        lazy_itable_init = 1;
> +        lazy_itable_zeroed = 1;
> +    }
>
> so we're not depending on it zeroing blocks, we're just depending on it
> advertising correctly whether or not it -does- zero.
>
> -Eric
>
>
I think that ATA devices have historically not done this correctly, but the T13
committee is working on it. The question is whether the bit we check and rely on
has the right semantics (and then if the device will reliably implement this).
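
For reference, here is a minimal sketch of the kind of "ask the device what it advertises" check being discussed. It assumes the kernel exposes the flag as a 0/1 sysfs attribute such as /sys/block/<dev>/queue/discard_zeroes_data. The helper name discard_zeroes_data_attr() is hypothetical and is not the mke2fs_discard_zeroes_data() from the patch. It only reports what the device claims, so it does not settle whether the claim is trustworthy.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): read a 0/1 sysfs-style
 * attribute such as /sys/block/<dev>/queue/discard_zeroes_data.
 *
 * Returns 1 if the device advertises that discarded blocks read back
 * as zeroes, 0 if it does not, and -1 if the attribute cannot be read
 * or parsed. This reports what the device claims, nothing more.
 */
int discard_zeroes_data_attr(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;
    int parsed;

    if (f == NULL)
        return -1;

    parsed = fscanf(f, "%d", &value);
    fclose(f);

    if (parsed != 1)
        return -1;

    return value != 0;
}
```

A cautious caller would skip explicit zeroing only when this returns 1, and treat both 0 and -1 as "zero the blocks yourself".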

Historically, array vendors did rely on SCSI commands like the old-fashioned
"WRITE_SAME" to initialize storage for them, but that takes a *long* time to run :)

Ric