Date:	Wed, 05 Nov 2014 16:27:52 +0200
From:	Artem Bityutskiy <dedekind1@...il.com>
To:	Tanya Brokhman <tlinder@...eaurora.org>
Cc:	richard@....at, linux-mtd@...ts.infradead.org,
	linux-arm-msm@...r.kernel.org,
	Randy Dunlap <rdunlap@...radead.org>,
	David Woodhouse <dwmw2@...radead.org>,
	Brian Norris <computersforpeace@...il.com>,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC/PATCH 1/5 v2] mtd: ubi: Read disturb infrastructure

Hi,

Let me summarize.

1. To handle the read disturb problem you selected the read-counters
solution. The PEB (physical erase block) read counter is the trigger
for scrubbing.

When the PEB read counter reaches a threshold value, we scrub.
The threshold value is 100000 by default. Users may change it via sysfs.

Read counters are stored on the flash media, in the fastmap structure.
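
For illustration, I imagine the trigger boils down to something like
this (rd_count, rd_threshold and schedule_scrub() are names I made up,
not taken from the patch):

/*
 * Sketch of the read-counter trigger: bump the per-PEB counter on
 * every read and scrub once it crosses the threshold. All names
 * here are invented for illustration.
 */
static void account_read(struct ubi_device *ubi, int pnum)
{
        if (++ubi->rd_count[pnum] >= ubi->rd_threshold) /* 100000 */
                schedule_scrub(ubi, pnum); /* scrub, reset the counter */
}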


2. To handle the data retention problem you selected the time-stamps
approach. Each PEB gets a time-stamp, and the time-stamp gets updated
when the PEB is erased.

When a PEB becomes old enough (judged by comparing its time-stamp to
the current system time), it gets scrubbed.

The threshold for "old enough" is 120 days by default. Users may change
it via sysfs.

Time-stamps are stored on the flash media, in the erase counter header
of the PEB.


For both the read-disturb and the data retention problems to be taken
care of, user-space has to periodically trigger a scan, which goes
through all PEBs, checks the read counter and the time-stamp, and
scrubs where needed.
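
I.e. the trigger could be as simple as this, say from a cron job (the
sysfs node name "trigger_scan" is made up, the patch may expose
something different):

/*
 * Userspace sketch of the periodic trigger. The sysfs attribute
 * name is invented for illustration.
 */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/class/ubi/ubi0/trigger_scan", "w");

        if (!f)
                return 1;
        fputs("1\n", f);        /* ask UBI to check all PEBs once */
        return fclose(f) ? 1 : 0;
}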


Hopefully I got it right. I have concerns.


This is a rather complex design, and it is not clear why just doing a
full flash read from time to time is not good enough, and why the
complexity is worth it.

In that case the trigger for scrubbing is a bit-flip. It indicates that
there is an actual problem, so it is a reliable trigger.

Is the 100000 reads threshold a reliable trigger? What if the right
value for my flash is 10 times larger, or smaller?

Is the 120 days threshold a reliable trigger? What if the right value
for my flash is much larger?

Do you think it is even possible to get the thresholds right?

Let me re-emphasize: a bit-flip is an objective, reliable reason to
scrub. The thresholds are more of a pessimistic prediction. No?

Let's think of someone having R/O storage with video files. Compare:

1. Do a full media read every 120 days in the background with low
priority (possibly implemented as a UBI service); see the sketch after
this comparison. And because the storage is R/O, cut the power all you
want.

2. Have UBI write to the media all the time to maintain the read
counters. Lose your fastmap and get slow mounts on power cuts. Do extra
work at boot time. Spend more RAM.

Why is 2 better than 1?
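
To make option 1 concrete: the reader can be completely dumb, because
UBI already schedules scrubbing on its own whenever a read reports a
correctable bit-flip. A minimal sketch (the volume node name is just an
example):

/*
 * Option 1 sketch: a dumb, low-priority full-volume reader. Reading
 * every byte is enough: UBI notices correctable bit-flips during the
 * read and scrubs the affected PEBs itself, with no on-flash counters
 * or time-stamps. The volume node /dev/ubi0_0 is just an example.
 */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        char buf[4096];
        int fd = open("/dev/ubi0_0", O_RDONLY);

        if (fd < 0)
                return 1;
        nice(19);                               /* low priority */
        while (read(fd, buf, sizeof(buf)) > 0)
                ;                               /* data is discarded */
        close(fd);
        return 0;
}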

Thanks!
