Message-ID: <yq1eiaiw0a7.fsf@sermon.lab.mkp.net>
Date:	Thu, 18 Nov 2010 20:49:04 -0500
From:	"Martin K. Petersen" <martin.petersen@...cle.com>
To:	Mark Lord <kernel@...savvy.com>
Cc:	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Greg Freemyer <greg.freemyer@...il.com>,
	James Bottomley <James.Bottomley@...e.de>,
	Jeff Moyer <jmoyer@...hat.com>,
	Christoph Hellwig <hch@...radead.org>,
	Matthew Wilcox <matthew@....cx>,
	Josef Bacik <josef@...hat.com>,
	Lukas Czerner <lczerner@...hat.com>, tytso@....edu,
	linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, sandeen@...hat.com
Subject: Re: [PATCH 1/2] fs: Do not dispatch FITRIM through separate super_operation

>>>>> "Mark" == Mark Lord <kernel@...savvy.com> writes:

Mark> Surely if a userspace tool and shell-script can accomplish this,
Mark> totally lacking real filesystem knowledge, then we should be able
Mark> to approximate it in kernel space?

It's the splitting and merging on stacked devices that's the hard
part, something wiper.sh does not have to deal with. And thanks to
differences in the protocols, the SCSI-ATA translation isn't a perfect
fit.

Every time TRIM comes up the discussion turns into how much we suck at
it because we don't support coalescing of discontiguous ranges.

However, we *do* support discarding contiguous ranges of up to about 2GB
per command on ATA. It's not like we're issuing a TRIM command for every
sector.
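
As a rough check of the "about 2GB" figure, the arithmetic can be sketched
as below. It assumes the ATA DATA SET MANAGEMENT (TRIM) payload layout of
8-byte range entries (48-bit starting LBA plus 16-bit sector count), a
single 512-byte payload block per command, and 512-byte logical sectors.
The constant names are illustrative and are not kernel identifiers.

```python
# Assumptions: one 512-byte TRIM payload block per command, 512-byte sectors.
SECTOR_BYTES = 512
PAYLOAD_BYTES = 512
ENTRY_BYTES = 8                  # 48-bit starting LBA + 16-bit sector count
MAX_SECTORS_PER_ENTRY = 0xFFFF   # largest value a 16-bit count can hold

entries_per_payload = PAYLOAD_BYTES // ENTRY_BYTES
max_sectors = entries_per_payload * MAX_SECTORS_PER_ENTRY
max_bytes = max_sectors * SECTOR_BYTES

print(entries_per_payload, max_sectors, max_bytes)
# prints: 64 4194240 2147450880
```

Under those assumptions a single command covers just under 2 GiB, provided
the 64 range entries describe one contiguous run.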

For offline/weekly reclaim/FITRIM we have the full picture when the
discard is issued. And thus we have the luxury of being able to send out
relatively big contiguous discards unless the filesystem is insanely
fragmented.
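
For readers who want to see what the batched path looks like from
userspace, here is a minimal sketch of issuing FITRIM on a mounted
filesystem. It assumes the interface as it exists in mainline
(struct fstrim_range with three __u64 fields, ioctl number
_IOWR('X', 121, struct fstrim_range)) and the ioctl bit layout used on
x86 and ARM; other architectures encode the direction bits differently.
The helper names pack_fstrim_range and fitrim are made up for this
example.

```python
import fcntl
import os
import struct

# Assumption: generic ioctl encoding (x86, ARM): dir=read|write, size=24, type='X', nr=121.
FITRIM = (3 << 30) | (24 << 16) | (ord("X") << 8) | 121

def pack_fstrim_range(start=0, length=2**64 - 1, minlen=0):
    """Pack struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }."""
    return struct.pack("=QQQ", start, length, minlen)

def fitrim(mountpoint, minlen=0):
    """Ask the filesystem mounted at `mountpoint` to discard its free space.

    Returns the byte count the kernel writes back into the len field.
    Needs root and a filesystem and device that support discard.
    """
    buf = bytearray(pack_fstrim_range(minlen=minlen))
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FITRIM, buf)  # buffer is updated in place
    finally:
        os.close(fd)
    _start, trimmed, _minlen = struct.unpack("=QQQ", buf)
    return trimmed
```

A weekly cron job would call something like fitrim("/home"). Raising
minlen tells the filesystem to skip free extents smaller than that many
bytes, which is one way to avoid the itty-bitty discards.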

For runtime discard usage we'll inevitably be issuing lots of itty-bitty
512-byte or 4KB discards, one command each. That's going to suck for
performance on your average ATA SSD. Doctor, it hurts when I do this...

So assuming we walk the filesystem to reclaim space on ATA SSDs on a
weekly basis (since that's the only sane approach): 

       What is the performance impact of not coalescing discontiguous
       block ranges when cron scrubs your /home at 4am Sunday morning?

That, to me, is the important question. That obviously depends on the
SSD, filesystem, fragmentation and so on. Is the win really big enough
to justify a bunch of highly intrusive changes to our I/O stack?
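
One way to put rough numbers on that question is to count commands for a
given free-space map, with and without packing discontiguous ranges into
a shared payload. The sketch below is a toy model, not a measurement: it
assumes 64 range entries of at most 65535 sectors per command, and both
free-space maps are invented for illustration. Command count is only a
proxy; the actual cost per command depends on the SSD.

```python
import math

MAX_SECTORS_PER_RANGE = 0xFFFF   # assumed 16-bit sector count per range entry
RANGES_PER_COMMAND = 64          # assumed: one 512-byte payload of 8-byte entries

def entries_for(count):
    """Range entries needed to cover one contiguous extent of `count` sectors."""
    return math.ceil(count / MAX_SECTORS_PER_RANGE)

def commands_without_coalescing(extents):
    """Each contiguous extent gets its own command(s); no payload sharing."""
    return sum(math.ceil(entries_for(count) / RANGES_PER_COMMAND)
               for _start, count in extents)

def commands_with_coalescing(extents):
    """Entries from unrelated extents are packed into shared payloads."""
    total_entries = sum(entries_for(count) for _start, count in extents)
    return math.ceil(total_entries / RANGES_PER_COMMAND)

# Hypothetical free-space maps: (start LBA, length in 512-byte sectors).
fragmented = [(i * 4096, 8) for i in range(10_000)]   # 10,000 scattered 4KB holes
healthy = [(0, 2_000_000), (5_000_000, 3_000_000)]    # two large free regions

for name, extents in (("fragmented", fragmented), ("healthy", healthy)):
    print(name, commands_without_coalescing(extents), commands_with_coalescing(extents))
# prints:
# fragmented 10000 157
# healthy 2 2
```

In this model coalescing only changes the count when free space is badly
fragmented; with a few large free regions the two approaches issue the
same number of commands.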

Thanks to PCIe SSDs and other upcoming I/O technologies we're working
hard to bring request latency down by simplifying things. Adding
complexity seems like a bad idea at this time. And that was the
rationale behind the consensus at the filesystem workshop.

-- 
Martin K. Petersen	Oracle Linux Engineering