Date:	Thu, 21 Nov 2013 02:11:01 -0800
From:	Christoph Hellwig <hch@...radead.org>
To:	Chinmay V S <cvs268@...il.com>
Cc:	"J. Bruce Fields" <bfields@...ldses.org>,
	Theodore Ts'o <tytso@....edu>,
	Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
	linux-fsdevel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
	LKML <linux-kernel@...r.kernel.org>,
	Matthew Wilcox <matthew@....cx>
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

> 
> 1. Most drives do NOT respond to CMD_FLUSH immediately, i.e. they wait
> until the data has actually reached the non-volatile media (which is
> the right behaviour), so performance drops.

Which is what the specification says they must do.

> 2. Some drives may implement CMD_FLUSH to return immediately i.e. no
> guarantee the data is actually on disk.

In which case they aren't spec compliant.  While I've seen countless
data integrity bugs on lower-end ATA SSDs, I've not seen one that simply
ignores flush.  If you wanted to cheat that bluntly you'd be better
off just claiming not to have a writeback cache.

> 3. Anyway, CMD_FLUSH does NOT guarantee atomicity. (Consider power
> failure in the middle of an ongoing CMD_FLUSH on non battery-backed
> disks).

It does not guarantee atomicity by itself, but it's the only low-level
primitive a filesystem or database can use to build atomic transactions
at a higher level on an ATA disk with the writeback cache enabled.
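
To make that concrete: the classic way to get an atomic update out of
nothing but ordinary writes plus the flush primitive is the
write-temp-file, fdatasync, rename, fsync-the-directory dance.  A rough
userspace sketch (untested, file names made up):

/*
 * atomic_replace.c - rough sketch: atomically replace file "name" in "dir"
 * by writing a temp file, flushing it, renaming it over the old name and
 * then flushing the directory.  The rename is only a valid commit point
 * because the fdatasync (and thus the FLUSH to the disk) orders the new
 * data ahead of it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int atomic_replace(const char *dir, const char *name,
			  const void *buf, size_t len)
{
	char tmp[4096], path[4096];
	int fd, dfd;

	snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name);
	snprintf(path, sizeof(path), "%s/%s", dir, name);

	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t)len)
		goto fail;
	/* force the new data to stable media; on an ATA disk with the
	   writeback cache enabled this is where CMD_FLUSH gets issued */
	if (fdatasync(fd))
		goto fail;
	close(fd);

	/* publish the new version; rename() is atomic wrt. readers */
	if (rename(tmp, path))
		return -1;

	/* and make the rename itself durable */
	dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd < 0)
		return -1;
	if (fsync(dfd)) {
		close(dfd);
		return -1;
	}
	close(dfd);
	return 0;

fail:
	close(fd);
	unlink(tmp);
	return -1;
}

int main(void)
{
	const char msg[] = "new contents\n";

	return atomic_replace(".", "data", msg, sizeof(msg) - 1) ? 1 : 0;
}

Journaling filesystems and databases do the same thing internally, just
with their own log format instead of a temp file.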

> In case the application cannot be modified to perform ASYNC IO, there
> exists a way to disable the behaviour of issuing a CMD_FLUSH for each
> sync() within the block device driver for SATA/SCSI disks. This is
> what is described by
> https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba

Which is utterly broken, and your insistence on pushing it shows you
do not understand the problem space.

You solve your performance problem by completely disabling any chance
of having data integrity guarantees, and you do so in a way that is not
detectable by applications or users.
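
To spell out the "not detectable" part: all the application ever sees is
that its flush request returned success.  Untested sketch, file name
made up:

/*
 * ack_after_sync.c - sketch of the contract the patch silently breaks:
 * the application writes a record and only acknowledges it once
 * fdatasync() has returned success.  It has no way of telling whether
 * the kernel really sent CMD_FLUSH or quietly skipped it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	const char rec[] = "commit record\n";
	int fd = open("journal", O_WRONLY | O_CREAT | O_APPEND, 0644);

	if (fd < 0)
		return 1;
	if (write(fd, rec, sizeof(rec) - 1) != (ssize_t)(sizeof(rec) - 1))
		return 1;
	/* with the stock driver this returns only after the disk has
	   emptied its writeback cache; with the gist applied it still
	   returns 0, but the record may sit in volatile cache and
	   vanish on power loss - nothing here can detect that */
	if (fdatasync(fd))
		return 1;
	printf("acknowledged\n");	/* the point of no return for the user */
	close(fd);
	return 0;
}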

If you have a workload with lots of small synchronous writes, disabling
the writeback cache on the disk does indeed often help, especially given
that FLUSH is non-queueable on all but the most recent ATA devices.

> Just to be clear, I am NOT recommending that this change be mainlined;
> rather it is a reference to improve performance in the rare cases (like
> in the OP Stefan's case) where both the app performing DIRECT SYNC
> block IO and the disk firmware implementing CMD_FLUSH can NOT be
> modified.  In that case the standard block driver behaviour of issuing
> a CMD_FLUSH with each write is too restrictive and is thus modified by
> the patch.

Again, what your patch does is explicitly ignore the data integrity
request from the application.  While this will usually be way faster,
it will also cause data loss.  Simply disabling the writeback cache
feature of the disk using hdparm will give you much better performance
than issuing all those FLUSH commands, especially if they are non-queued,
but without breaking the guarantee to the application.
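
If you want to see that difference for yourself, a crude comparison
looks something like this (untested sketch; file name and iteration
count are arbitrary, and /dev/sdX stands in for the real disk):

/*
 * dsync_bench.c - crude timing loop for small O_DSYNC writes.  Run it
 * once after "hdparm -W1 /dev/sdX" (write cache on, every write ends in
 * a flush to the drive) and once after "hdparm -W0 /dev/sdX" (write
 * cache off, no FLUSH needed) to see the tradeoff.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	struct timespec t0, t1;
	double secs;
	int i, n = 1000;
	int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC | O_DSYNC, 0644);

	if (fd < 0)
		return 1;
	memset(buf, 0xab, sizeof(buf));

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
			return 1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%d x 4k O_DSYNC writes: %.2f s, %.0f writes/s\n",
	       n, secs, n / secs);
	close(fd);
	return 0;
}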
