Message-ID: <20101122202002.GA2767@thunk.org>
Date:	Mon, 22 Nov 2010 15:20:02 -0500
From:	Ted Ts'o <tytso@....edu>
To:	Lukas Czerner <lczerner@...hat.com>
Cc:	linux-ext4@...r.kernel.org, sandeen@...hat.com, adilger@...ger.ca
Subject: Re: [PATCH] mke2fs: Inform user about ongoing discard

On Thu, Oct 21, 2010 at 04:23:02PM +0200, Lukas Czerner wrote:
> Since there are some slow SSD's out there and big thinly provisioned
> storages on which it takes quite long to issue discard through whole
> device, it would be nice to provide user the information about what is
> going on and how long it will take (approximately).
> 
> Signed-off-by: Lukas Czerner <lczerner@...hat.com>

Hi Lukas,

I've looked at this patch, and one thing disturbs me about it: you
are discarding the first 1% of the disk five times, for no good
reason other than to get the timing, before then executing the
discard for the entire disk.  There are a couple of problems with this:

*) For smart/competently implemented SSD's, discarding the same part
of the disk five times might lead to a misleading timing: the smart
device could easily determine that the first 1% is already not in use
after the first discard, and the subsequent 4 discards could be
treated as no-ops.

*) Mark Lord has claimed that there exist a large number of
incompetently implemented SSD's out there that may actually be
executing a flash erase of the discarded region.  If true, executing
an extra flash erase on 1% of the disk five times for no good reason
might not be the best thing to do for the longevity of the device.

I was tempted to fix this up myself, but since I'm trying to get
better at delegating work to others, may I suggest the following
changes?

1)  Implement block device ioctl's for the kernel that export the
discard_granularity, discard_alignment, and max_discard_sectors.  

2) Change mke2fs so that the discard is done in a separate function.
Said function should attempt to fetch the discard_granularity,
discard_alignment, and max_discard_sectors.

3) This new function in mke2fs should start by discarding
approximately 1% of the device at a time, respecting discard_granularity
and discard_alignment.  If a discard takes less than a second, it
should double the amount that it discards at a time.  If a discard
takes longer than 4 seconds, it should halve the amount that it
discards (again, always respecting discard_granularity and
discard_alignment).  After each chunk of the device that it discards,
the function can display the time elapsed and the estimated time
remaining, assuming it can use ^M to redraw the progress report (which
of course should be suppressed if the -q option is specified on the
command line).
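
To make step 3 concrete, here is a rough sketch of the chunk-sizing
logic.  It is not code from mke2fs: struct discard_limits,
round_chunk(), align_down(), adapt_chunk(), discard_adaptive() and the
fake_discard() stand-in are all hypothetical names, the limits are
assumed to be already fetched and expressed in bytes, and an offset is
assumed to be aligned when (offset - discard_alignment) is a multiple
of discard_granularity.  The callback is assumed to time one discard
call itself and return the seconds it took.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the three queue limits, all in bytes here. */
struct discard_limits {
    uint64_t granularity; /* discard_granularity; 0 means unknown */
    uint64_t alignment;   /* discard_alignment */
    uint64_t max_bytes;   /* max_discard_sectors in bytes; 0 means no limit */
};

/* Clamp len to max_bytes and round it down to a whole number of granules,
 * but never return less than one granule. */
static uint64_t round_chunk(uint64_t len, const struct discard_limits *lim)
{
    uint64_t g = lim->granularity ? lim->granularity : 1;

    if (lim->max_bytes && len > lim->max_bytes)
        len = lim->max_bytes;
    len -= len % g;
    return len ? len : g;
}

/* Largest offset <= off with (offset - alignment) % granularity == 0,
 * or 0 if there is none. */
static uint64_t align_down(uint64_t off, const struct discard_limits *lim)
{
    uint64_t g = lim->granularity ? lim->granularity : 1;
    uint64_t rem = (off % g + g - lim->alignment % g) % g;

    return rem <= off ? off - rem : 0;
}

/* Double the chunk after a fast discard (< 1 s), halve it after a slow
 * one (> 4 s), otherwise keep it. */
static uint64_t adapt_chunk(uint64_t chunk, double secs,
                            const struct discard_limits *lim)
{
    if (secs < 1.0 && chunk <= UINT64_MAX / 2)
        chunk *= 2;
    else if (secs > 4.0)
        chunk /= 2;
    return round_chunk(chunk, lim);
}

/* Discards [off, off + len) and returns the seconds it took, < 0 on error. */
typedef double (*timed_discard_fn)(uint64_t off, uint64_t len, void *ctx);

static int discard_adaptive(uint64_t dev_size, const struct discard_limits *lim,
                            timed_discard_fn discard, void *ctx)
{
    uint64_t off = 0, chunk;

    assert(lim != NULL && discard != NULL);
    chunk = round_chunk(dev_size / 100, lim); /* start near 1% */
    while (off < dev_size) {
        uint64_t end = dev_size - off > chunk ? off + chunk : dev_size;
        uint64_t aligned = align_down(end, lim);
        double secs;

        if (end < dev_size && aligned > off)
            end = aligned; /* finish each chunk on an aligned boundary */
        secs = discard(off, end - off, ctx);
        if (secs < 0)
            return -1;
        off = end;
        chunk = adapt_chunk(chunk, secs, lim);
    }
    return 0;
}

/* Stand-in for a real timed discard call: records the ranges it is given
 * and pretends every call took half a second. */
struct fake_dev {
    uint64_t next_off;
    uint64_t total;
    unsigned calls;
    int contiguous;
};

static double fake_discard(uint64_t off, uint64_t len, void *ctx)
{
    struct fake_dev *d = ctx;

    if (off != d->next_off)
        d->contiguous = 0;
    d->next_off = off + len;
    d->total += len;
    d->calls++;
    return 0.5;
}
```

Every byte is discarded exactly once, so no discard is spent purely on
measurement; a real caller would pass a callback that wraps the actual
discard request with a timer in place of fake_discard().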


This design doesn't "waste" any discards, which is both faster and
reduces wear on badly designed SSD's.  It also continuously updates
the user with the amount of time it takes to complete the discard
process.  It also will respect the discard_granularity and
discard_alignment restrictions; and of course, it allows the user to
interrupt the discard, without needing a special kernel patch.
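
For the progress report, one simple option is a linear extrapolation
from the bytes discarded so far.  The sketch below is only an
illustration under that assumption; eta_seconds(), format_progress()
and report_progress() are hypothetical names, and the message text is
made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Linear extrapolation: remaining = elapsed * (bytes left / bytes done). */
static double eta_seconds(uint64_t done, uint64_t total, double elapsed)
{
    if (done == 0 || done >= total)
        return 0.0; /* nothing measured yet, or already finished */
    return elapsed * (double)(total - done) / (double)done;
}

/* Write one progress line into buf.  The leading '\r' (^M) moves the
 * cursor back to column 0 so the next report overwrites this one. */
static int format_progress(char *buf, size_t n, uint64_t done,
                           uint64_t total, double elapsed)
{
    unsigned pct = total
        ? (unsigned)((double)done * 100.0 / (double)total) : 100;

    assert(buf != NULL && n > 0);
    return snprintf(buf, n,
                    "\rDiscarding: %3u%% done, %.0fs elapsed, ~%.0fs left",
                    pct, elapsed, eta_seconds(done, total, elapsed));
}

/* Print the progress line unless quiet mode (-q) was requested. */
static void report_progress(FILE *out, int quiet, uint64_t done,
                            uint64_t total, double elapsed)
{
    char line[128];

    if (quiet)
        return;
    format_progress(line, sizeof line, done, total, elapsed);
    fputs(line, out);
    fflush(out);
}
```

Because the estimate is based on bytes rather than on the number of
chunks, it stays meaningful when the chunk size is doubled or halved
along the way.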

Does this make sense to you?

						- Ted