linux-ext4 - Re: fsync() errors is unsafe and risks data loss

Message-ID: <20180419141016.GA23437@fieldses.org>
Date:   Thu, 19 Apr 2018 10:10:16 -0400
From:   "J. Bruce Fields" <bfields@...ldses.org>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     Martin Steigerwald <martin@...htvoll.de>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        "Joshua D. Drake" <jd@...mandprompt.com>,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: fsync() errors is unsafe and risks data loss

On Thu, Apr 19, 2018 at 01:39:04AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 18, 2018 at 12:52:19PM -0400, J. Bruce Fields wrote:
> > > Theodore Y. Ts'o - 10.04.18, 20:43:
> > > > First of all, what storage devices will do when they hit an exception
> > > > condition is quite non-deterministic.  For example, the vast majority
> > > > of SSDs are not power fail certified.  What this means is that if
> > > > they suffer a power drop while they are doing a GC, it is quite
> > > > possible for data written six months ago to be lost as a result.  The
> > > > LBA could potentially be far, far away from any LBAs that were
> > > > recently written, and there could have been multiple CACHE FLUSH
> > > > operations in the time since the LBA in question was last written six
> > > > months ago.  No matter; for a consumer-grade SSD, it's possible for
> > > > that LBA to be trashed after an unexpected power drop.
> > 
> > Pointers to documentation or papers or anything?  The only google
> > results I can find for "power fail certified" are your posts.
> > 
> > I've always been confused by SSD power-loss protection, as nobody seems
> > completely clear whether it's a safety or a performance feature.
> 
> Devices from reputable vendors should always be power fail safe, bugs
> notwithstanding.  What power-loss protection in marketing slides usually
> means is that an SSD has a non-volatile write cache.  That is, once a
> write is ACKed the data is persisted and no additional cache flush needs
> to be sent.  This feature is available only in expensive enterprise SSDs,
> as the required capacitors are expensive.  Cheaper consumer or boot
> drive SSDs have a volatile write cache; that is, we need to do a
> separate cache flush to persist data (REQ_OP_FLUSH in Linux).  But
> a reasonable implementation of those still won't corrupt previously
> written data; they will just lose the volatile write cache that hasn't
> been flushed.  Occasional bugs, bad actors or other issues might still
> happen.

Thanks!  That was my understanding too.  But then the name is terrible.
As is all the vendor documentation I can find:

	https://insights.samsung.com/2016/03/22/power-loss-protection-how-ssds-are-protecting-data-integrity-white-paper/

	"Power loss protection is a critical aspect of ensuring data
	integrity, especially in servers or data centers."

	https://www.intel.com/content/.../ssd-320-series-power-loss-data-protection-brief.pdf

	"Data safety features prepare for unexpected power-loss and
	protect system and user data."

Why do they all neglect to mention that their consumer drives are also
perfectly capable of well-defined behavior after power loss, just at the
expense of flush performance?  It's ridiculously confusing.
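
To make the distinction concrete, here is a rough userspace sketch, not
anything taken from vendor documentation.  It assumes Linux, where
/sys/block/<dev>/queue/write_cache reads "write back" for a volatile
cache and "write through" when no flush is needed, and it assumes the
filesystem turns fsync() into a device cache flush when one is required.
The helper names needs_flush() and durable_write() are made up for
illustration.

```python
import os


def needs_flush(write_cache_attr: str) -> bool:
    """Interpret the text read from /sys/block/<dev>/queue/write_cache.

    "write back" means the device has a volatile write cache, so data is
    only persistent after a cache flush; "write through" means it is not
    needed.  (Hypothetical helper; the caller reads the sysfs file.)
    """
    return write_cache_attr.strip() == "write back"


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only once fsync() has succeeded.

    Assumption: on a device with a volatile cache the kernel issues the
    cache flush as part of fsync().  Making the directory entry of a
    newly created file durable is a separate step not handled here.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write() may write fewer bytes than requested.
            offset += os.write(fd, data[offset:])
        os.fsync(fd)  # raises OSError if the flush to stable storage fails
    finally:
        os.close(fd)
```

On a drive with a non-volatile cache the fsync() should return quickly
because no device flush is needed; on a consumer drive the same call is
where the flush cost shows up.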

--b.