linux-kernel - Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

Message-ID: <20131120175807.GC5380@fieldses.org>
Date:	Wed, 20 Nov 2013 12:58:07 -0500
From:	"J. Bruce Fields" <bfields@...ldses.org>
To:	Chinmay V S <cvs268@...il.com>
Cc:	Theodore Ts'o <tytso@....edu>,
	Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
	Christoph Hellwig <hch@...radead.org>,
	linux-fsdevel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
	LKML <linux-kernel@...r.kernel.org>,
	Matthew Wilcox <matthew@....cx>
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

On Wed, Nov 20, 2013 at 10:41:54PM +0530, Chinmay V S wrote:
> On Wed, Nov 20, 2013 at 9:25 PM, J. Bruce Fields <bfields@...ldses.org> wrote:
> > Some SSDs also claim the ability to flush the cache on power loss:
> >
> >         http://www.intel.com/content/www/us/en/solid-state-drives/ssd-320-series-power-loss-data-protection-brief.html
> >
> > Which should in theory let them respond immediately to flush requests,
> > right?  Except they only seem to advertise it as a safety (rather than a
> > performance) feature, so I probably misunderstand something.
> >
> > And the 520 doesn't claim this feature (look for "enhanced power loss
> > protection" at http://ark.intel.com/products/66248), so that wouldn't
> > explain these results anyway.
> 
> FYI, nowhere does Intel imply that the CMD_FLUSH is instantaneous. The
> product brief for Intel 320 SSDs (above link), explains that it is
> implemented by a power-fail detection circuit that detects drop in
> power-supply, following which the on-disk controller issues an internal
> CMD_FLUSH equivalent command to ensure data is moved to the
> non-volatile area from the disk-cache. Large secondary capacitors
> ensure backup supply for this brief duration.
> 
> Thus applications can always perform asynchronous I/O upon the disk,
> taking comfort in the fact that the physical disk ensures that all
> data in the volatile disk-cache is automatically transferred to the
> non-volatile area even in the event of an external power-failure. Thus
> the host never has to worry about issuing a CMD_FLUSH (which is still
> a terribly expensive performance bottleneck, even on the Intel 320
> SSDs).

So why is it up to the application to do this and not the drive?
Naively I'd've thought it would be simpler if the protocol allowed the
drive to respond instantly if it knows it can do so safely, and then you
could always issue flush requests, and save some poor admin from having
to read spec sheets to figure out if they can safely mount "nobarrier".

Is it that you want to eliminate CMD_FLUSH entirely because the protocol
still has some significant overhead even if the drive responds to it
quickly?
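
For anyone who wants to see what these flushes cost on their own hardware, here is a rough sketch, not a proper benchmark. It times the same small writes three ways: plain buffered, with O_DSYNC, and with an explicit fdatasync() after each write. The helper names, the 4 KiB block size, and the count of 64 are arbitrary choices made for illustration. Whether a sync actually reaches the drive as a cache flush depends on the filesystem and mount options (for example "nobarrier"), so treat the numbers as indicative only.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 4096  # one 4 KiB write per iteration (arbitrary size)
COUNT = 64            # arbitrary iteration count, kept small on purpose


def timed_writes(path, extra_flags=0, sync_each=False, count=COUNT):
    """Write `count` blocks to `path`; return (bytes_written, seconds)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o600)
    written = 0
    start = time.perf_counter()
    try:
        for _ in range(count):
            written += os.write(fd, BLOCK)
            if sync_each:
                # Ask the kernel to push this file's data to stable storage.
                os.fdatasync(fd)
    finally:
        os.close(fd)
    return written, time.perf_counter() - start


def compare(directory=None):
    """Run the three variants against a scratch file in `directory`."""
    results = {}
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        path = os.path.join(tmp, "probe.dat")
        results["buffered"] = timed_writes(path)
        results["o_dsync"] = timed_writes(path, extra_flags=os.O_DSYNC)
        results["fdatasync"] = timed_writes(path, sync_each=True)
    return results


if __name__ == "__main__":
    for name, (nbytes, secs) in compare().items():
        print(f"{name:10s} {nbytes} bytes in {secs:.4f}s")
```

Pass a directory on the filesystem under test to compare(); the default temporary directory may sit on tmpfs, where syncs are nearly free and the comparison says nothing about the drive.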

--b.