Message-ID: <yq11tz1p5o9.fsf@sermon.lab.mkp.net>
Date:	Mon, 17 Feb 2014 20:31:50 -0500
From:	"Martin K. Petersen" <martin.petersen@...cle.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	"Martin K. Petersen" <martin.petersen@...cle.com>,
	linux-fsdevel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
	linux-ext4@...r.kernel.org
Subject: Re: [PATCH RFC] block: use discard if possible in blkdev_issue_discard()

>>>>> "Ted" == Theodore Ts'o <tytso@....edu> writes:

Ted,

Ted> So currently blkdev_issue_zeroout() will do the WRITE SAME, but it
Ted> doesn't set the UNMAP bit, correct?  

Correct, because it explicitly tells the device to write zeroes.

Ted> I understand there will be environments where performance is more
Ted> important than cost, where it may not be a good idea to set the
Ted> UNMAP bit.  So it sounds like what we should do is add a flag
Ted> which controls whether or not to use TRIM w/ RZAT or WRITE SAME
Ted> with the UNMAP bit set.

The rationale behind blkdev_issue_discard was to provide a facility to
mark a block range as unused by the filesystem, with the expectation
that those blocks would be deallocated/deprovisioned on the device.

The rationale behind blkdev_issue_zeroout was to provide a cleared
block range, with the expectation that those blocks would be
allocated/provisioned on the device.

Your variant seems to land somewhere in-between. You want a
blkdev_issue_clear() that zeroes a block range and it's then up to the
storage device to decide whether to provision or deprovision the space
as long as you are guaranteed to get zeroes back for each block in the
range on a subsequent read. Is that a correct interpretation?
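
To make the three sets of expectations concrete, here is a rough
user-space sketch in C. It is not kernel code: the enum, struct, and
function names are all made up for illustration, blkdev_issue_clear()
is only the hypothetical variant described above, and the assumption
that a plain discard promises nothing about what a later read returns
is stated here rather than taken from the discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model only; none of these identifiers are kernel APIs. */
enum prov_expectation {
	PROV_ALLOCATED,     /* caller expects blocks to stay provisioned */
	PROV_DEALLOCATED,   /* caller expects blocks to be deprovisioned */
	PROV_DEVICE_CHOICE, /* either outcome is acceptable */
};

struct range_op_semantics {
	const char *name;
	bool reads_zero_guaranteed; /* must a later read return zeroes? */
	enum prov_expectation prov;
};

static const struct range_op_semantics range_ops[] = {
	/* Assumption: discard alone promises nothing about read contents. */
	{ "discard", false, PROV_DEALLOCATED },
	{ "zeroout", true,  PROV_ALLOCATED },
	/* The in-between variant discussed above (hypothetical). */
	{ "clear",   true,  PROV_DEVICE_CHOICE },
};

/* Look up an operation by name; returns NULL if it is not in the table. */
static const struct range_op_semantics *lookup_range_op(const char *name)
{
	for (size_t i = 0; i < sizeof(range_ops) / sizeof(range_ops[0]); i++) {
		if (strcmp(range_ops[i].name, name) == 0)
			return &range_ops[i];
	}
	return NULL;
}

/*
 * May a device satisfy this operation by unmapping the range?
 * Only if unmapping is an acceptable provisioning outcome and, where
 * zeroes are required, the device reads zeroes from unmapped blocks.
 */
static bool may_unmap(const struct range_op_semantics *op,
		      bool unmapped_reads_zero)
{
	if (op->prov == PROV_ALLOCATED)
		return false;
	return !op->reads_zero_guaranteed || unmapped_reads_zero;
}
```

In this model, may_unmap(lookup_range_op("clear"), true) holds while
may_unmap(lookup_range_op("zeroout"), true) does not, which is the
distinction the question above is trying to confirm.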

I'm trying to pin down your exact use case because the semantics get
very murky between the SCSI and ATA variants, and because the same knobs
are used for both over- and under-provisioned devices. SCSI also has an
additional state: blocks can be mapped, anchored, or deallocated.
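
A minimal sketch of those three SCSI states, again in plain user-space
C with made-up names. It assumes the usual reading of logical block
provisioning: a mapped block holds data, an anchored block holds no
data but keeps capacity reserved so a later write should not fail for
lack of space, and a deallocated block holds neither. Treat that
summary as my understanding rather than a quotation from the standard.

```c
#include <stdbool.h>

/* Hypothetical names modelling the three provisioning states. */
enum lba_state { LBA_MAPPED, LBA_ANCHORED, LBA_DEALLOCATED };

/* Does an LBA in this state have physical capacity reserved for it? */
static bool lba_holds_capacity(enum lba_state s)
{
	return s == LBA_MAPPED || s == LBA_ANCHORED;
}

/* Does an LBA in this state still carry data written by the host? */
static bool lba_holds_data(enum lba_state s)
{
	return s == LBA_MAPPED;
}
```

The anchored state is what has no direct ATA counterpart: capacity is
held but no data is, so "unmapped" and "space freed" stop being the
same thing.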

Ted> On the other hand, if there was a white list kept somewhere, either
Ted> in the kernel, or in some more dynamically updated list (ala what
Ted> smartctl does to get the latest vendor-specific attributes), being
Ted> on the white list might be enough of a commercial advantage that
Ted> drive vendors would be incentivized to provide such a guarantee.
Ted> Especially if, say, a major SSD vendor such as Intel could be
Ted> induced to make such a public guarantee and we publicized this fact.

I'm perfectly fine with maintaining a whitelist if we can get vendors to
commit.
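
For illustration, such a whitelist could be as simple as a table of
model-string prefixes. The sketch below is user-space C under that
assumption; the table name, the function name, and the entries are
placeholders, not real products or existing kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Placeholder entries only; these are not real product names. */
static const char *const zero_after_trim_whitelist[] = {
	"EXAMPLE-VENDOR-A SSD",
	"EXAMPLE-VENDOR-B ",
};

/* Return true if the model string starts with a whitelisted prefix. */
static bool model_is_whitelisted(const char *model)
{
	size_t n = sizeof(zero_after_trim_whitelist) /
		   sizeof(zero_after_trim_whitelist[0]);

	for (size_t i = 0; i < n; i++) {
		const char *prefix = zero_after_trim_whitelist[i];

		if (strncmp(model, prefix, strlen(prefix)) == 0)
			return true;
	}
	return false;
}
```

A device whose model string is not matched would simply fall back to
explicitly written zeroes.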

-- 
Martin K. Petersen	Oracle Linux Engineering