Message-ID: <20101119013301.GU3290@thunk.org>
Date: Thu, 18 Nov 2010 20:33:01 -0500
From: Ted Ts'o <tytso@....edu>
To: Mark Lord <kernel@...savvy.com>
Cc: James Bottomley <James.Bottomley@...e.de>,
Greg Freemyer <greg.freemyer@...il.com>,
Jeff Moyer <jmoyer@...hat.com>,
Christoph Hellwig <hch@...radead.org>,
Matthew Wilcox <matthew@....cx>,
Josef Bacik <josef@...hat.com>,
Lukas Czerner <lczerner@...hat.com>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, sandeen@...hat.com
Subject: Re: [PATCH 1/2] fs: Do not dispatch FITRIM through separate
super_operation
> >
> >Before we go gung ho on this, there's no evidence that N discontiguous
> >ranges in one command are any better than the ranges sent N times ...
> >the same amount of erase overhead gets sent on SSDs.
>
> No, we do have evidence: execution time of the TRIM commands on the SSD.
>
> The one-range-at-a-time is incredibly slow compared to multiple
> ranges at a time. That slowness comes from somewhere, with about
> 99.9% certainty that it is due to the drive performing slow flash
> erase cycles.

Mark, I think you are over-generalizing here. You have observed with
some number of flash drives (maybe only one, but I don't know that
for sure) that TRIM is slow. I don't completely accept your
conclusion that it is because the drive is doing slow flash erase
cycles: I haven't seen your measurements, and since we know that any
kind of command that requires a queue drain/flush before it can
execute is going to be slow, I don't know what kind of _slow_ you are
observing.

But even if we *do* grant that you've seen one disk, or even a lot of
disks, doing something stupid, that just means that their
manufacturer has some idiotic engineers. It does not follow that all
SSDs, or thin-provisioned drives, or other devices implementing the
ATA TRIM command, will do so in an incompetent way.

If you look at the T13 definition of TRIM, it is just a hint that
the contents of the block range do not _have_ to be preserved. It
does not say that they *must* be erased. This is not a security erase
command. In fact, it is perfectly reasonable for the TRIM command to
store state in volatile storage, so that the information about which
blocks have been TRIMmed gets discarded on a power failure.
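
To make the "hint, not erase" point concrete, here is a toy model in
Python. It is only a sketch of the semantics described above, not how
any real drive or the T13 spec is implemented; the class and method
names (ToyTrimDevice, garbage_collect, power_fail) are made up for
illustration. TRIM just records block numbers in volatile state, the
device reclaims them whenever it likes, and a power failure simply
forgets the hints.

```python
class ToyTrimDevice:
    """Hypothetical device that treats TRIM purely as a hint."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.data = {}           # persistent: block number -> contents
        self.trim_hints = set()  # volatile: blocks the host no longer needs

    def _check(self, start, length=1):
        if start < 0 or length < 0 or start + length > self.num_blocks:
            raise ValueError("block range out of bounds")

    def write(self, block, contents):
        self._check(block)
        self.data[block] = contents
        self.trim_hints.discard(block)  # a new write cancels an old hint

    def read(self, block):
        self._check(block)
        return self.data.get(block)  # None means "nothing stored"

    def trim(self, start, length):
        # Only record the hint; no erase cycle happens here.
        self._check(start, length)
        self.trim_hints.update(range(start, start + length))

    def garbage_collect(self):
        """Reclaim hinted blocks at the device's leisure.

        Returns the number of blocks actually reclaimed.
        """
        reclaimed = 0
        for block in self.trim_hints:
            if block in self.data:
                del self.data[block]
                reclaimed += 1
        self.trim_hints.clear()
        return reclaimed

    def power_fail(self):
        # Volatile state is lost; persistent data is untouched.
        self.trim_hints.clear()
```

In this model, a TRIM followed by a power failure leaves the old data
in place, which is allowed: the contents did not _have_ to be
preserved, but nothing says they must be gone either.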

So if SSDs are doing a full flash erase cycle for each TRIM, that may
not necessarily be a good idea. I accept that there may be some
incompetent implementations out there. But I don't think this means
we should assume that _all_ implementations are incompetent. It does
mean, though, that we can't turn any of these features on by default.
But that's something we know already.
- Ted