Message-ID: <2edcf8e344937b3c5b92a0b87ebd13bd@walle.cc>
Date:   Mon, 07 Dec 2020 21:39:32 +0100
From:   Michael Walle <michael@...le.cc>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     linux-ext4@...r.kernel.org, linux-mmc@...r.kernel.org,
        linux-block@...r.kernel.org
Subject: Re: discard feature, mkfs.ext4 and mmc default fallback to normal
 erase op

Hi Ted,

On 2020-12-07 19:35, Theodore Y. Ts'o wrote:
> On Mon, Dec 07, 2020 at 04:10:27PM +0100, Michael Walle wrote:
>> Hi,
>> 
>> The problem I'm having is that I'm trying to install debian on
>> an embedded system onto an sdcard. During installation it will
>> format the target filesystem, but the "mkfs.ext4 -F /dev/mmcblk0p2"
>> takes ages.
>> 
>> What I've found out so far:
>>  - mkfs.ext4 tries to discard all blocks on the target device
>>  - with my target device being an sdcard it seems to fall back
>>    to normal erase [1], with erase_arg being set to what the card
>>    is capable of [2]
>> 
>> Now I'm trying to figure out if this behavior is intended. I guess
>> one can reduce it to "blkdiscard /dev/mmcblk0p2". Should this
>> actually fall back to normal erasing or should it return -EOPNOTSUPP?
> 
> There are three different MMC commands which are defined:
> 
> 1) DISCARD
> 2) ERASE
> 3) SECURE ERASE
> 
> The first two are expected to be fast, since they only involve clearing
> some metadata fields in the Flash Translation Layer (FTL), so that the
> LBAs in the specified range are no longer mapped to a flash page.

Mh, where is it specified that the erase command is fast? According
to the Physical Layer Simplified Specification Version 8.00:

  The actual erase time may be quite long, and the host may issue CMD7
  to deselect the card or perform card disconnection, as described in
  the Block Write section, above.

Honest question. Reading "4.14 Erase Timeout Calculation" also doesn't
make it sound like erase is fast.
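
To make the timeout arithmetic concrete, here is a minimal sketch of how
I read that calculation: a per-AU erase time derived from the card's
reported values, multiplied by the number of allocation units, plus a
fixed offset. The function and parameter names are my own, and the
250 ms fallback and the 1 second floor are assumptions taken from my
reading of the spec and the kernel code, not something I have verified
against this card.

```python
def sd_erase_timeout_ms(num_aus, t_erase_s=0, n_erase=0, t_offset_s=0):
    """Estimate an SD erase timeout in milliseconds (hypothetical helper).

    num_aus    -- number of allocation units (AUs) to erase
    t_erase_s  -- card-reported time in seconds to erase n_erase AUs
    n_erase    -- number of AUs that t_erase_s refers to
    t_offset_s -- card-reported fixed offset in seconds
    """
    if t_erase_s and n_erase:
        # Per-AU time scaled to the requested number of AUs, plus offset.
        per_au_ms = (t_erase_s * 1000) // n_erase
        timeout_ms = per_au_ms * num_aus + t_offset_s * 1000
    else:
        # Assumed fallback when the card reports no erase timing:
        # 250 ms per unit.
        timeout_ms = 250 * num_aus
    # Assumed floor: never report less than one second.
    return max(timeout_ms, 1000)
```

With made-up values (2 s per 16 AUs, 1 s offset), erasing 1024 AUs gives
125 ms * 1024 + 1000 ms = 129000 ms, i.e. a bit over two minutes of
allowed erase time for a fairly small partition.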

Also there is this comment:
https://elixir.bootlin.com/linux/v5.9.12/source/drivers/mmc/core/core.c#L1495

> The difference between "discard" and "erase" is that "discard" is a
> hint, so the device is allowed to ignore it whenever it wants (in
> practice, if it's busy doing a GC, or if it's busy writing back blocks
> in its writeback cache).  "Erase" is guaranteed to work, in that after
> an erase, a read from a specified sector MUST return all zeros, but
> that can easily be done by redirecting a pointer in the FTL metadata.
> 
> "Secure Erase" is the one which can be slow, since it requires
> physically zeroing all of the flash pages (although if the device is
> self-encrypting, this in theory could also be fast if you're doing a
> secure erase at the granularity of the device's encryption keys, so
> all it needs to do is to regenerate the crypto key).
> 
> It sounds like your SD card is implementing the "erase" command in a
> particularly non-optimal way.  If it's common, perhaps we need some
> kind of blacklist for drivers with badly implemented erase commands.
> As a workaround, you can run mke2fs with the command-line option "-E
> discard=0".

I've already tested that "mkfs.ext4 -E nodiscard" is fast (that is, it
behaves the same way mkfs did before it started discarding first).
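
For anyone who wants to see what the block layer advertises for their
card before mkfs issues the discard, here is a small sketch that reads
the queue limits from sysfs. The helper name is hypothetical, and
treating a zero discard_max_bytes as "no discard advertised" is my
assumption about how the kernel reports it.

```python
import os


def read_discard_limits(disk, sysfs_root="/sys/block"):
    """Read discard queue limits for a whole disk such as "mmcblk0".

    Returns a dict holding the raw sysfs values plus a boolean guess
    at whether discard is advertised at all.
    """
    queue_dir = os.path.join(sysfs_root, disk, "queue")
    limits = {}
    for name in ("discard_granularity", "discard_max_bytes"):
        with open(os.path.join(queue_dir, name)) as f:
            limits[name] = int(f.read().strip())
    # Assumption: a zero maximum means the device advertises no discard.
    limits["discard_advertised"] = limits["discard_max_bytes"] > 0
    return limits
```

On the target this would be called as read_discard_limits("mmcblk0"),
using the whole-disk name rather than the partition. Note that it only
tells you whether discard is offered, not how long the card takes to
execute it.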

But I wouldn't say it is a cheapo card (Toshiba Exceria). I cannot
rule out that it is a Chinese clone, but it looks authentic ;)


> P.S.  If your SD card got "erase" wrong, I'd be a little worried about
> what else the FTL implementation may have screwed up.  So you might
> want to consider simply getting a different SD card --- especially if this is
> something that you plan to distribute as a product to downstream
> customers.  In general, low-end flash needs to be very carefully
> qualified to make sure they are competently implemented if you plan to
> deploy in large quantities.  An example of what happens if this
> qualification process is not done:
> 
> https://insideevs.com/news/376037/tesla-mcu-emmc-memory-issue/
> 
> Tesla is currently under investigation by the National Highway Traffic
> Safety Administration due to cheaping out on their eMMC flash
> (probably just a few pennies per unit).  Given that customers are
> having to pay $1500 to replace their engine controller out of warranty
> (and the NHTSA is considering whether or not to force Tesla to eat the
> costs, as opposed to forcing their customers to pay $$$), that's an
> example of false economy....

Yeah, I'm aware of the Tesla eMMC wear-out problem. But I'm looking at
this mainly from a user's point of view. Take our product, where the
user can freely choose their SD card, only to then notice that
installing their distribution is painfully slow. So I'm interested in
understanding the implications: can the erase command really be
assumed to be fast?

-michael
