Message-ID: <4A85DF1E.3050801@rtr.ca>
Date:	Fri, 14 Aug 2009 18:03:10 -0400
From:	Mark Lord <liml@....ca>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
Cc:	Greg Freemyer <greg.freemyer@...il.com>, david@...g.hm,
	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Nitin Gupta <ngupta@...are.org>, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-scsi@...r.kernel.org, linux-ide@...r.kernel.org,
	Linux RAID <linux-raid@...r.kernel.org>
Subject: Re: Discard support (was Re: [PATCH] swap: send callback when swap
  slot is freed)

James Bottomley wrote:
> On Thu, 2009-08-13 at 14:15 -0400, Greg Freemyer wrote:
>> On Thu, Aug 13, 2009 at 12:33 PM, <david@...g.hm> wrote:
>>> On Thu, 13 Aug 2009, Markus Trippelsdorf wrote:
>>>
>>>> On Thu, Aug 13, 2009 at 08:13:12AM -0700, Matthew Wilcox wrote:
>>>>> I am planning a complete overhaul of the discard work.  Users can send
>>>>> down discard requests as frequently as they like.  The block layer will
>>>>> cache them, and invalidate them if writes come through.  Periodically,
>>>>> the block layer will send down a TRIM or an UNMAP (depending on the
>>>>> underlying device) and get rid of the blocks that have remained unwanted
>>>>> in the interim.
>>>> That is a very good idea. I've tested your original TRIM implementation on
>>>> my Vertex yesterday and it was awful ;-). The SSD needs hundreds of
>>>> milliseconds to digest a single TRIM command. And since your
>>>> implementation sends a TRIM for each extent of each deleted file, the
>>>> whole system is unusable after a short while.
>>>> An optimal solution would be to consolidate the discard requests, bundle
>>>> them and send them to the drive as infrequently as possible.
>>> or queue them up and send them when the drive is idle (you would need to
>>> keep track to make sure the space isn't re-used)
>>>
>>> as an example, if you would consider spinning down a drive you don't hurt
>>> performance by sending accumulated trim commands.
>>>
>>> David Lang
>> An alternate approach is for the block layer to maintain its own bitmap
>> of used/unused sectors / blocks. Unmap commands from the filesystem just
>> cause the bitmap to be updated.  No other effect.
>>
>> (Big unknown: Where will the bitmap live between reboots?  Require DM
>> volumes so we can have a dedicated bitmap volume in the mix to store
>> the bitmap to? Maybe on mount, the filesystem has to be scanned to
>> initially populate the bitmap?   Other options?)
> 
> I wouldn't really have it live anywhere.  Discard is best effort; it's
> not required for fs integrity.  As long as we don't discard an in-use
> block we're free to do anything else (including forget to discard,
> rediscard a discarded block etc).
> 
> It is theoretically possible to run all of this from user space using
> the fs mappings, a bit like a defrag command.
..

Already a work-in-progress -- see my wiper.sh script on the hdparm page
at sourceforge.  Trimming 50+GB of free space on a 120GB Vertex
(over 100 million sectors) takes a *single* TRIM command,
and completes in only a couple of seconds.
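
For anyone following along, here is a rough sketch of the batching idea
discussed in this thread: accumulate discards, drop anything that gets
rewritten, merge neighbours, and hand the drive one consolidated list.
It is not the wiper.sh code and not the block layer's design. The names
(DiscardCache, coalesce, split_for_trim) are made up for illustration,
and the 65535-sector limit per range assumes the 16-bit count field of
an ATA TRIM range entry.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start_lba, sector_count)

# Assumption: each TRIM range entry has a 16-bit sector count.
MAX_RANGE_SECTORS = 0xFFFF


def coalesce(extents: List[Extent]) -> List[Extent]:
    """Sort extents and merge any that overlap or touch."""
    merged: List[Extent] = []
    for start, count in sorted(extents):
        if count <= 0:
            continue
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_count = merged[-1]
            new_end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, count))
    return merged


def split_for_trim(extents: List[Extent],
                   max_sectors: int = MAX_RANGE_SECTORS) -> List[Extent]:
    """Break extents into pieces no longer than max_sectors each."""
    out: List[Extent] = []
    for start, count in extents:
        while count > 0:
            n = min(count, max_sectors)
            out.append((start, n))
            start += n
            count -= n
    return out


class DiscardCache:
    """Hypothetical cache of pending discards (illustration only)."""

    def __init__(self) -> None:
        self._extents: List[Extent] = []

    def discard(self, start: int, count: int) -> None:
        # Remember the range; nothing is sent to the drive yet.
        self._extents = coalesce(self._extents + [(start, count)])

    def write(self, start: int, count: int) -> None:
        # A write makes the blocks live again, so cut them out
        # of every pending discard that overlaps the write.
        end = start + count
        kept: List[Extent] = []
        for s, c in self._extents:
            e = s + c
            if e <= start or s >= end:
                kept.append((s, c))
                continue
            if s < start:
                kept.append((s, start - s))
            if e > end:
                kept.append((end, e - end))
        self._extents = kept

    def flush(self) -> List[Extent]:
        # Return everything still unwanted and forget it.
        pending, self._extents = self._extents, []
        return pending


cache = DiscardCache()
cache.discard(0, 100)
cache.discard(100, 50)   # touches the previous range, so they merge
cache.discard(300, 10)
cache.write(40, 20)      # sectors 40..59 are in use again
batch = split_for_trim(cache.flush())
```

Forgetting a pending range on flush or on a crash is harmless, since
discard is best effort; the only hard rule the sketch tries to keep is
never to hand back a range that has been written since it was discarded.
Under the same 16-bit assumption, a fully contiguous 100 million sectors
would need about 1526 range entries, which is small enough to explain
why one command can cover that much space.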

Cheers