linux-ext4 - Re: Some interesting input from a flash manufacturer

Message-ID: <20120302231131.GE22215@thunk.org>
Date:	Fri, 2 Mar 2012 18:11:31 -0500
From:	Ted Ts'o <tytso@....edu>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	linux-ext4@...r.kernel.org, Lukas Czerner <lczerner@...hat.com>
Subject: Re: Some interesting input from a flash manufacturer

On Fri, Mar 02, 2012 at 03:04:48PM -0600, Eric Sandeen wrote:
> > One is that he would actually be very happy if we send lots of extra
> > trim commands; in particular, he would actually *like* us to send trims
> > at unlink/commit time, *and* trims periodically via FITRIM.  The reason
> > for that is because that way, if the disk is busy, it would be OK if he
> > dropped the TRIM on the floor, knowing that he would get another bite at
> > the apple later on.  But, if the disk has time to process the trim, he
> > would be able to use that information as quickly as possible.
> 
> Is that within spec?

Yup; the drive manufacturer is free to do anything they want with the
TRIM command; it's purely advisory.  So dropping it on the floor if
you're too busy because some other process is sending random 4k writes
to you at a high rate, is something that's within spec.

Or if the thin provisioning service is only tracking blocks with a
granularity of 4 megabytes, and it receives a trim request for less than 4
megabytes, it again is perfectly free to drop the trim request on the
floor.  I'm even aware of one implementation which remembers the trim
request while the system is powered on, but since it doesn't
(necessarily) write the trim information to stable store, you could
trim the block, read the block and get zeros, then take a power
failure, and afterwards, read the block and get the previous contents.

As far as I know, the Trim spec allows all of this.
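
For reference, here is a minimal sketch of how a periodic FITRIM pass can
be issued from userspace.  The helper names (pack_fstrim_range, fitrim) are
made up for illustration; the ioctl number and struct fstrim_range layout
are taken from <linux/fs.h>.  The call normally needs root, and since trim
is advisory, the byte count it returns is only what the file system handed
to the block layer, not a promise that the device acted on it.

```python
import fcntl
import os
import struct

# FITRIM = _IOWR('X', 121, struct fstrim_range), as defined in <linux/fs.h>.
FITRIM = 0xC0185879


def pack_fstrim_range(start=0, length=2**64 - 1, minlen=0):
    """Pack struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }."""
    return struct.pack("=QQQ", start, length, minlen)


def fitrim(mountpoint, minlen=0):
    """Ask the file system mounted at `mountpoint` to discard its free extents.

    `minlen` asks the file system to skip free extents smaller than this
    many bytes.  Returns the byte count the kernel writes back into `len`.
    """
    buf = bytearray(pack_fstrim_range(minlen=minlen))
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FITRIM, buf)  # the kernel updates buf in place
    finally:
        os.close(fd)
    _start, trimmed, _minlen = struct.unpack("=QQQ", buf)
    return trimmed
```

Something like fitrim("/mnt/data", minlen=4 * 1024 * 1024) would skip free
extents under 4 megabytes, which is one way to avoid sending requests that
a coarse-grained backend would drop anyway.  The mount point here is only a
placeholder.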

> > We also talked about ways that we might write some application notes so
> > that handset OEM's understood how to use mke2fs parameters to optimize
> > their file systems for different types of flash systems, and perhaps
> > ways that the eMMC spec could be enhanced so that key parameters such as
> > erase block size, flash page size, and translation table granularity
> > could be passed back to the block layer, and made available to file
> > system and mkfs.
> 
> Now that would be nice.  Could some of this just be piggybacked on the
> existing preferred_io_size-type geometry interfaces?  

As far as the /sys/block/XXX/queue/* framework goes, certainly.  It's not
clear, however, whether or not we should use entirely new parameters,
or try to reuse the existing parameters.  For example, would it be
better to use optimal_io_size for the flash page size, or the erase
block size?
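
As a rough illustration of what is exported today, the sketch below reads a
few of the existing queue attributes for one block device.  The function
name read_queue_limits and the choice of attributes are assumptions made
for this example; none of these attributes currently carries erase block
size, flash page size, or translation table granularity.

```python
import os

# Existing attributes under /sys/block/<dev>/queue/ that describe I/O and
# discard geometry.  All values are in bytes.
QUEUE_ATTRS = (
    "minimum_io_size",
    "optimal_io_size",
    "discard_granularity",
    "discard_max_bytes",
)


def read_queue_limits(dev, sysfs_root="/sys/block"):
    """Return {attribute: int or None} for the block device named `dev`."""
    limits = {}
    for name in QUEUE_ATTRS:
        path = os.path.join(sysfs_root, dev, "queue", name)
        try:
            with open(path) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable on this kernel or device.
            limits[name] = None
    return limits
```

A call such as read_queue_limits("mmcblk0") would return a dictionary of
whatever the driver reports; the device name is just a placeholder, and a
value of 0 generally means the device did not report that limit.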

						- Ted