linux-kernel - Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

Message-ID: <20131124002236.GA10600@amd.pavel.ucw.cz>
Date:	Sun, 24 Nov 2013 01:22:36 +0100
From:	Pavel Machek <pavel@....cz>
To:	Ric Wheeler <ricwheeler@...il.com>
Cc:	Howard Chu <hyc@...as.com>, Theodore Ts'o <tytso@....edu>,
	Chinmay V S <cvs268@...il.com>,
	Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
	Christoph Hellwig <hch@...radead.org>,
	linux-fsdevel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
	LKML <linux-kernel@...r.kernel.org>, matthew@....cx
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

On Sat 2013-11-23 18:01:32, Ric Wheeler wrote:
> On 11/23/2013 03:36 PM, Pavel Machek wrote:
> >On Wed 2013-11-20 08:02:33, Howard Chu wrote:
> >>Theodore Ts'o wrote:
> >>>Historically, Intel has been really good about avoiding this, but
> >>>since they've moved to using 3rd party flash controllers, I now advise
> >>>everyone who plans to use any flash storage, regardless of the
> >>>manufacturer, to do their own explicit power fail testing (hitting the
> >>>reset button is not good enough, you need to kick the power plug out
> >>>of the wall, or better yet, use a network controlled power switch
> >>>so you can repeat the power fail test dozens or hundreds of times for
> >>>your qualification run) before using flash storage in a mission
> >>>critical situation where you care about data integrity after a power
> >>>fail event.
> >>Speaking of which, what would you use to automate this sort of test?
> >>I'm thinking an SSD connected by eSATA, with an external power
> >>supply, and the host running inside a VM. Drop power to the drive at
> >>the same time as doing a kill -9 on the VM, then you can resume the
> >>VM pretty quickly instead of waiting for a full reboot sequence.
> >I was just pulling power on a SATA drive.
> >
> >It uncovered "interesting" stuff. I plugged power back in, and the kernel
> >re-established communication with that drive, but any settings made with
> >hdparm were forgotten. I'd say there's some room for improvement
> >there...
> 
> Hi Pavel,
> 
> When you drop power, your drive normally loses temporary settings
> (like a change to write cache, etc).
> 
> Depending on the class of the device, there are ways to make that
> permanent (look at hdparm or sdparm for details).
> 
> This is a feature of the drive and its firmware, not something we
> reset in the device each time it re-appears.

Yes, and I'm arguing that is a bug (as in, fewer than 0.01% of people
are using hdparm correctly).

So you used hdparm to disable the write cache so that ext3 can be safely
used on your hdd. Now you have a glitch on power. Then the system
continues to operate in dangerous mode until reboot.
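
A minimal sketch of how one might notice this from userspace, assuming
hdparm is installed, the script runs as root, and `hdparm -W <device>`
prints a line of the form "write-caching =  0 (off)". The function names
and the idea of re-checking after a reattach are illustrative assumptions,
not anything the kernel does today:

```python
import re
import subprocess


def parse_write_caching(hdparm_output):
    """Parse `hdparm -W` output (assumed format: 'write-caching =  1 (on)').

    Returns True if the cache is reported on, False if off,
    and None if no such line is found.
    """
    match = re.search(r"write-caching\s*=\s*(\d)", hdparm_output)
    if match is None:
        return None
    return match.group(1) == "1"


def write_cache_enabled(device):
    """Query the drive's current write-cache state via `hdparm -W`."""
    result = subprocess.run(
        ["hdparm", "-W", device],
        capture_output=True, text=True, check=True,
    )
    return parse_write_caching(result.stdout)


def redisable_if_lost(device):
    """If the drive came back with its write cache on, turn it off again.

    Returns True if the setting had been lost and was reapplied.
    """
    if write_cache_enabled(device):
        subprocess.run(["hdparm", "-W0", device], check=True)
        return True
    return False
```

Calling redisable_if_lost("/dev/sdX") after the drive reappears (the device
name is a placeholder) would restore the setting, but it does nothing about
writes the drive already dropped from its cache.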

I guess it would be safer not to reattach drives after a power
failure... (also I wonder what this does to data integrity. The drive
lost the contents of its writeback cache, but the kernel continues...
The journal will not prevent data corruption in this case).

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html