Message-Id: <54658239-0690-45BB-9CBD-5B1975E4DEF0@dilger.ca>
Date: Thu, 5 Aug 2010 09:47:16 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: Ted Ts'o <tytso@....edu>
Cc: Greg Freemyer <greg.freemyer@...il.com>,
Lukas Czerner <lczerner@...hat.com>,
Dmitry Monakhov <dmonakhov@...nvz.org>,
linux-ext4@...r.kernel.org, jmoyer@...hat.com, rwheeler@...hat.com,
eshishki@...hat.com, sandeen@...hat.com, jack@...e.cz,
Mark Lord <kernel@...savvy.com>
Subject: Re: [PATCH 1/3] Add ioctl FITRIM.
On 2010-08-04, at 18:28, Ted Ts'o wrote:
> On Wed, Aug 04, 2010 at 11:26:56AM -0400, Greg Freemyer wrote:
>> If true, a way to control the progress from userspace is important.
>>
>> If in general it is only going to take a few seconds for a full FITRIM
>> to run, it is much less important, but I suppose the RT project
>> might find even that problematic.
>
> Even without the RT project, if disk activity is slowed or
> completely stopped for a few seconds, I can think of plenty of
> workloads where this would be totally unacceptable. Suppose you are
> running a web site; it doesn't really matter whether it is at Google,
> Facebook, Twitter, etc. If this means that one or more web pages get
> stalled by "a few seconds" while the FITRIM is going on, this is
> generally not considered acceptable. Even if it slows down the server
> by 30-50%, for some sites this would also be quite unacceptable.
>
> This is a hard problem to solve, though, especially if there is an
> insistence to solve it in a fs-independent fashion. I could imagine
> doing this at work, by doing things one block group at a time, and
> then I could measure, for our specific hardware, how badly disk
> performance would get hit, and for how long, and then the userspace
> daemon could control how many block groups to do per unit time.
> But this would be of necessity ext2/3/4 specific....
I think "block group at a time" is simply the extN way of saying "range of blocks at a time". An API that requests "trim free blocks from [M,N]" is generic enough to apply to any filesystem. If there is some way to query the "efficient trim increment size" (i.e. block group for extN, allocation group for xfs, ??? for btrfs), then userspace could use that; otherwise it could simply pick some fraction of the filesystem, rounded to a nice power-of-two value, and hope it works out.
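To make the "trim free blocks from [M,N]" idea concrete, here is a rough userspace sketch. It assumes the FITRIM interface as it later appeared in mainline <linux/fs.h> (struct fstrim_range with byte-based start/len/minlen), which is not necessarily what the patch under discussion looked like. The helper names trim_chunk() and trim_in_chunks(), the caller-chosen chunk size, and the pause between chunks are all hypothetical; there is no claim here that a way to query an efficient increment size exists.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <time.h>
#include <linux/fs.h>   /* FITRIM, struct fstrim_range (mainline definition) */

/*
 * Hypothetical helper: fill *r with the idx-th chunk of the byte range
 * [0, total). Returns 1 if such a chunk exists, 0 once past the end
 * (or if chunk is 0).
 */
int trim_chunk(uint64_t total, uint64_t chunk, uint64_t idx,
               struct fstrim_range *r)
{
    if (chunk == 0 || idx > total / chunk)
        return 0;                       /* also guards idx * chunk overflow */

    uint64_t start = idx * chunk;
    if (start >= total)
        return 0;

    r->start  = start;
    r->len    = (total - start < chunk) ? total - start : chunk;
    r->minlen = 0;                      /* assumption: no minimum extent size */
    return 1;
}

/*
 * Hypothetical driver: trim the filesystem behind fd one chunk at a time,
 * sleeping pause_ms between requests so other I/O can make progress.
 * Returns 0 on success or a negative errno value from the failing ioctl.
 */
int trim_in_chunks(int fd, uint64_t total, uint64_t chunk, unsigned pause_ms)
{
    struct fstrim_range r;
    struct timespec pause = {
        .tv_sec  = pause_ms / 1000,
        .tv_nsec = (long)(pause_ms % 1000) * 1000000L,
    };

    for (uint64_t i = 0; trim_chunk(total, chunk, i, &r); i++) {
        if (ioctl(fd, FITRIM, &r) < 0)
            return -errno;
        nanosleep(&pause, NULL);        /* crude throttle between chunks */
    }
    return 0;
}
```

A daemon would open the mount point, obtain the filesystem size in bytes by some other means, and then tune chunk and pause_ms against measured latency on the hardware in question. Nothing in the loop is filesystem specific; only the choice of chunk size would be.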
> So I'm not sure what to suggest here. Maybe the answer is we can have
> a fs-independent ioctl for desktop workloads, and one which gives more
> fine-grained control for those who need it? That seems ugly, but it
> might be the best compromise.
No, too ugly.
Cheers, Andreas