Message-ID: <528FB6AE.1080405@profihost.ag>
Date:	Fri, 22 Nov 2013 20:55:26 +0100
From:	Stefan Priebe <s.priebe@...fihost.ag>
To:	Chinmay V S <cvs268@...il.com>
CC:	Christoph Hellwig <hch@...radead.org>,
	linux-fsdevel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
	LKML <linux-kernel@...r.kernel.org>, matthew@....cx
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

On 20.11.2013 16:22, Chinmay V S wrote:
> Hi Stefan,
>
>> Thanks for your great and detailed reply. I'm just wondering why an
>> Intel 520 SSD degrades speed by only 2% under O_SYNC, while the Intel
>> 530, the newer model and replacement for the 520, degrades speed by
>> 75%, like the Crucial m4.
>>
>> The Intel DC S3500, by contrast, delivers nearly 98% of its
>> performance even under O_SYNC.
>
> If you have confirmed the performance numbers, then it indicates that
> the Intel 530 controller is more advanced and makes better use of the
> internal disk-cache to achieve higher performance than the Intel 520.
> Thus forcing CMD_FLUSH on each IOP (negating the benefits of the disk
> write-cache and preventing any advanced disk-controller optimisations)
> has a more pronounced degrading effect on Intel 530 SSDs. (Someone
> with actual info on Intel SSDs, kindly confirm this.)
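
For anyone reproducing such numbers, a minimal userspace test along the
following lines makes the per-write cost of O_DSYNC directly visible.
This is only a sketch (BLKSZ and COUNT are arbitrary choices, and it
overwrites the start of the device, so point it at a scratch disk only):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define BLKSZ	4096
#define COUNT	1000

int main(int argc, char **argv)
{
	struct timespec t0, t1;
	void *buf;
	double secs;
	int fd, i;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <device> (contents WILL be overwritten)\n",
			argv[0]);
		return 1;
	}

	/* O_DSYNC: each write() returns only after the data has reached
	 * the device - this is what triggers a CMD_FLUSH per IOP on
	 * drives that advertise a volatile write cache. */
	fd = open(argv[1], O_WRONLY | O_DSYNC);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	if (posix_memalign(&buf, BLKSZ, BLKSZ)) {
		perror("posix_memalign");
		return 1;
	}
	memset(buf, 0xab, BLKSZ);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < COUNT; i++) {
		if (pwrite(fd, buf, BLKSZ, (off_t)i * BLKSZ) != BLKSZ) {
			perror("pwrite");
			return 1;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%d x %dB O_DSYNC writes: %.0f IOPS\n", COUNT, BLKSZ,
	       COUNT / secs);

	close(fd);
	free(buf);
	return 0;
}

Run against the same disk with and without the patch below, the IOPS
difference should mirror the 2% vs. 75% degradation figures above.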
>
>>> To simply disable this behaviour and make SYNC/DSYNC behaviour and
>>> performance on raw block-device I/O resemble standard filesystem
>>> I/O, you may want to apply the following patch to your kernel -
>>> https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
>>>
>>> The above patch simply disables CMD_FLUSH command support even on
>>> disks that claim to support it.
>>
>> Is this the right one? By adding ahci_dummy_read_id we disable
>> CMD_FLUSH?
>>
>> What is the risk of that?
>
> Yes, https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba is the
> right one. The dummy read_id() provides a hook into the initial
> disk-properties discovery process when the disk is plugged in. By
> explicitly negating the bits that indicate cache and flush-cache
> (CMD_FLUSH) support, we ensure that the block driver does NOT issue
> CMD_FLUSH commands to the disk. Note that this does NOT disable the
> write-cache on the disk itself, i.e. performance improves due to the
> on-disk write-cache in the absence of any CMD_FLUSH commands from the
> host PC.

Ah OK, thanks.
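
For reference, the core of that approach would look something like the
sketch below. This is reconstructed from the description above, not the
verbatim gist; the word/bit positions are the standard ATA IDENTIFY
capability bits as named in include/linux/ata.h of a 3.x kernel, and the
hook gets wired into ahci_ops via its .read_id member:

static unsigned int ahci_dummy_read_id(struct ata_device *dev,
				       struct ata_taskfile *tf, u16 *id)
{
	/* Perform the normal IDENTIFY DEVICE... */
	unsigned int err_mask = ata_do_dev_read_id(dev, tf, id);

	if (!err_mask) {
		/* ...then hide the write-cache and FLUSH CACHE (EXT)
		 * capability bits from the host. The drive's own write
		 * cache stays enabled; the host merely stops issuing
		 * CMD_FLUSH. */
		id[ATA_ID_COMMAND_SET_1] &= ~(1 << 5);	/* wcache supported */
		id[ATA_ID_CFS_ENABLE_1]  &= ~(1 << 5);	/* wcache enabled */
		id[ATA_ID_COMMAND_SET_2] &= ~((1 << 13) | (1 << 12));
		id[ATA_ID_CFS_ENABLE_2]  &= ~((1 << 13) | (1 << 12));
	}
	return err_mask;
}

/* ...and in ahci_ops:
 *	.read_id	= ahci_dummy_read_id,
 */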

> Theoretically, it increases the chances of data loss, i.e. if power is
> removed while a write from the app is in progress. Personally though,
> I have found that the impact of this is minimal, because SYNC on a raw
> block device with CMD_FLUSH does NOT guarantee atomicity in case of a
> power loss. Hence, in the event of a power loss, applications cannot
> rely on SYNC (with CMD_FLUSH) for data integrity. Rather, they have to
> maintain other data structures with redundant disk metadata (which is
> precisely what modern file-systems do). Thus, removing CMD_FLUSH
> doesn't really result in a downside as such.

In my production system I've got Crucial m500s, which have a capacitor,
so in case of a power loss they flush their data to disk automatically.

> The main thing to consider when applying the above simple patch is
> that it is system-wide. The above patch prevents the host-PC from
> issuing CMD_FLUSH for ALL drives enumerated via SATA/SCSI on the
> system.
>
> If this patch works for you, then to restrict the change in behaviour
> to a specific disk, you will need to:
> 1. Identify the disk by its model number within the dummy read_id().
> 2. Zero the bits ONLY for your particular disk.
> 3. Return without modifying anything for all other disks.
>
> Try out the above patch and let me know if you have any further issues.
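
Those three steps might look roughly like this inside the same hook
(again only a sketch; "Crucial_CT960M500" is a placeholder - take the
exact model string from hdparm -I or /sys/block/sdX/device/model):

static unsigned int ahci_dummy_read_id(struct ata_device *dev,
				       struct ata_taskfile *tf, u16 *id)
{
	unsigned char model[ATA_ID_PROD_LEN + 1];
	unsigned int err_mask = ata_do_dev_read_id(dev, tf, id);

	if (err_mask)
		return err_mask;

	/* 1. Identify the disk by its model string from the IDENTIFY data. */
	ata_id_c_string(id, model, ATA_ID_PROD, sizeof(model));

	/* 2. Zero the cache/flush bits ONLY for the matching disk;
	 * 3. all other disks pass through unmodified. */
	if (!strncmp((char *)model, "Crucial_CT960M500", 17)) {
		id[ATA_ID_COMMAND_SET_1] &= ~(1 << 5);
		id[ATA_ID_CFS_ENABLE_1]  &= ~(1 << 5);
		id[ATA_ID_COMMAND_SET_2] &= ~((1 << 13) | (1 << 12));
		id[ATA_ID_CFS_ENABLE_2]  &= ~((1 << 13) | (1 << 12));
	}
	return err_mask;
}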

The best thing would be a flag under /sys/block/sdc/device/ for SSDs
with a capacitor - so everybody can decide on their own.

Stefan