Message-ID: <87f94c370908141554ia447f5fo87c74d5d8c517c1c@mail.gmail.com>
Date: Fri, 14 Aug 2009 18:54:11 -0400
From: Greg Freemyer <greg.freemyer@...il.com>
To: Mark Lord <liml@....ca>
Cc: James Bottomley <James.Bottomley@...senpartnership.com>,
david@...g.hm, Markus Trippelsdorf <markus@...ppelsdorf.de>,
Matthew Wilcox <willy@...ux.intel.com>,
Hugh Dickins <hugh.dickins@...cali.co.uk>,
Nitin Gupta <ngupta@...are.org>, Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-scsi@...r.kernel.org, linux-ide@...r.kernel.org,
Linux RAID <linux-raid@...r.kernel.org>
Subject: Re: Discard support (was Re: [PATCH] swap: send callback when swap
slot is freed)
On Fri, Aug 14, 2009 at 6:03 PM, Mark Lord<liml@....ca> wrote:
> James Bottomley wrote:
>>
>> On Thu, 2009-08-13 at 14:15 -0400, Greg Freemyer wrote:
>>>
>>> On Thu, Aug 13, 2009 at 12:33 PM, <david@...g.hm> wrote:
>>>>
>>>> On Thu, 13 Aug 2009, Markus Trippelsdorf wrote:
>>>>
>>>>> On Thu, Aug 13, 2009 at 08:13:12AM -0700, Matthew Wilcox wrote:
>>>>>>
>>>>>> I am planning a complete overhaul of the discard work. Users can send
>>>>>> down discard requests as frequently as they like. The block layer will
>>>>>> cache them, and invalidate them if writes come through. Periodically,
>>>>>> the block layer will send down a TRIM or an UNMAP (depending on the
>>>>>> underlying device) and get rid of the blocks that have remained
>>>>>> unwanted in the interim.
>>>>>
>>>>> That is a very good idea. I tested your original TRIM implementation
>>>>> on my Vertex yesterday and it was awful ;-). The SSD needs hundreds of
>>>>> milliseconds to digest a single TRIM command. And since your
>>>>> implementation sends a TRIM for each extent of each deleted file, the
>>>>> whole system is unusable after a short while.
>>>>> An optimal solution would be to consolidate the discard requests,
>>>>> bundle them and send them to the drive as infrequently as possible.
>>>>
>>>> or queue them up and send them when the drive is idle (you would need to
>>>> keep track to make sure the space isn't re-used)
>>>>
>>>> as an example, if you would consider spinning down a drive you don't
>>>> hurt performance by sending accumulated trim commands.
>>>>
>>>> David Lang
>>>
>>> An alternate approach is for the block layer to maintain its own bitmap
>>> of used/unused sectors or blocks. Unmap commands from the filesystem
>>> just cause the bitmap to be updated. No other effect.
>>>
>>> (Big unknown: Where will the bitmap live between reboots? Require DM
>>> volumes so we can have a dedicated bitmap volume in the mix to store
>>> the bitmap to? Maybe on mount, the filesystem has to be scanned to
>>> initially populate the bitmap? Other options?)
>>
>> I wouldn't really have it live anywhere. Discard is best effort; it's
>> not required for fs integrity. As long as we don't discard an in-use
>> block we're free to do anything else (including forget to discard,
>> rediscard a discarded block etc).
>>
>> It is theoretically possible to run all of this from user space using
>> the fs mappings, a bit like a defrag command.
>
> ..
>
> Already a work-in-progress -- see my wiper.sh script on the hdparm page
> at sourceforge. Trimming 50+GB of free space on a 120GB Vertex
> (over 100 million sectors) takes a *single* TRIM command,
> and completes in only a couple of seconds.
>
> Cheers
>
Mark,
What filesystems does your script support? Running a tool like this
in the middle of the night makes a lot of sense to me, even from the
perspective of many, if not most, enterprise users.

How do you prevent a race where a block becomes used between userspace
asking for its status and userspace sending the discard request?
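
To make the race concrete, here is a minimal Python sketch of the kind of
pending-discard cache Matthew described above: discards are queued and
coalesced, and a write to a queued range removes it from the queue before
anything reaches the drive. This is only a model, not kernel code and not
how wiper.sh works; the class and method names are made up for illustration.

```python
class PendingDiscards:
    """Hypothetical model of a discard queue.

    Holds sorted, disjoint [start, end) sector ranges waiting to be
    sent to the device as one batched TRIM/UNMAP.
    """

    def __init__(self):
        self.ranges = []  # list of (start, end) tuples

    def discard(self, start, end):
        """Queue [start, end), merging with overlapping or adjacent ranges."""
        if start >= end:
            return
        merged = []
        for s, e in self.ranges:
            if e < start or s > end:
                merged.append((s, e))      # no contact with the new range
            else:
                start, end = min(s, start), max(e, end)
        merged.append((start, end))
        merged.sort()
        self.ranges = merged

    def write(self, start, end):
        """A write makes [start, end) live again, so drop it from the queue."""
        kept = []
        for s, e in self.ranges:
            if e <= start or s >= end:
                kept.append((s, e))        # untouched by this write
                continue
            if s < start:
                kept.append((s, start))    # part before the write survives
            if e > end:
                kept.append((end, e))      # part after the write survives
        self.ranges = kept

    def flush(self):
        """Return the batch to send to the device and empty the queue."""
        batch, self.ranges = self.ranges, []
        return batch


q = PendingDiscards()
q.discard(0, 100)
q.discard(100, 200)    # adjacent, so it merges into (0, 200)
q.discard(500, 600)
q.write(50, 60)        # sectors 50..59 are in use again
print(q.flush())
```

The invalidation works here because the queue sits in the same path that
sees the writes. A userspace tool has no such hook, which is why I am
asking how the window between the status query and the discard is closed.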
ps: I tried to pull wiper.sh straight from sourceforge, but I'm
getting some crazy page asking all sorts of questions and not letting
me bypass it. I hope sourceforge is broken. The other option is they
meant to do this. :(
Greg
--
Greg Freemyer