Message-ID: <20080815120235.GJ13048@mit.edu>
Date: Fri, 15 Aug 2008 08:02:35 -0400
From: Theodore Tso <tytso@....edu>
To: David Woodhouse <dwmw2@...radead.org>
Cc: linux-ext4@...r.kernel.org
Subject: Re: [EXT2] Discard unused sectors

On Thu, Aug 14, 2008 at 10:05:48AM +0100, David Woodhouse wrote:
> I'm not sure how to do this for ext[34]. The sb_issue_discard() function
> issues its requests as a soft barrier, because for naïve callers it
> needs to ensure that the discard happens _before_ any subsequent writes
> to the same sectors (if they get reallocated immediately).
>
> But ext[34] can probably do better than that, and submit the discard
> requests _without_ barriers of their own. If someone with a bit more
> clue does it, that is.

It's worse than this. We can't call sb_issue_discard() until the
transaction commits, since if we crash before the commit, the delete
will not have happened, and the blocks we discarded would still belong
to a live file. (The block/inode bitmaps, inode table, et al., aren't
allowed to go out to disk until the transaction commits, and
similarly, those sectors aren't allowed to get reused until the commit
happens.)

This is going to be true of any filesystem that does journaling.

What makes life a bit more difficult for ext4 is that we are doing
physical block journaling, so we're not keeping track of which blocks
are getting discarded. (In contrast, systems that do logical
journaling are keeping track of specific lists of blocks that are
getting freed, since that's what they write to the journal.) This
means we'll have to keep our own in-memory list of extents for which
we should call sb_issue_discard() when the transaction finally
commits. So this is something that we would have to track in the
jbd/jbd2 layer, hanging off of the transaction structure. If we do
this right, OCFS2 will be able to use it too (since it uses the jbd
layer as well).
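
To make the idea concrete, here is a rough user-space sketch of a
per-transaction list of freed extents that only gets discarded at
commit time. Everything in it is made up for illustration: struct txn,
struct freed_extent, txn_note_freed(), txn_commit(), txn_abort() and
fake_issue_discard() are hypothetical names, not the jbd/jbd2 API, and
fake_issue_discard() is only a counting stand-in for
sb_issue_discard().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One contiguous run of freed blocks (hypothetical, not a jbd2 type). */
struct freed_extent {
	uint64_t start;
	uint64_t len;
	struct freed_extent *next;
};

/* Stand-in for the journal's transaction structure. */
struct txn {
	struct freed_extent *freed;	/* extents freed in this transaction */
};

/* Counters so the sketch can be observed without a block device. */
static uint64_t discards_issued;
static uint64_t blocks_discarded;

/* Counting stand-in for sb_issue_discard(); it touches no device. */
static void fake_issue_discard(uint64_t start, uint64_t len)
{
	(void)start;
	discards_issued++;
	blocks_discarded += len;
}

/* Called when blocks are freed: remember the extent, discard nothing yet. */
static int txn_note_freed(struct txn *t, uint64_t start, uint64_t len)
{
	struct freed_extent *e;

	assert(len > 0);
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->start = start;
	e->len = len;
	e->next = t->freed;
	t->freed = e;
	return 0;
}

/* Empty the list, issuing discards only if the transaction committed. */
static void txn_drain(struct txn *t, int committed)
{
	struct freed_extent *e = t->freed;

	while (e) {
		struct freed_extent *next = e->next;

		if (committed)
			fake_issue_discard(e->start, e->len);
		free(e);
		e = next;
	}
	t->freed = NULL;
}

/* Commit: the frees are now durable, so the discards are safe. */
static void txn_commit(struct txn *t)
{
	txn_drain(t, 1);
}

/* Abort (or crash recovery): the frees never happened, so discard nothing. */
static void txn_abort(struct txn *t)
{
	txn_drain(t, 0);
}
```

A caller would invoke txn_note_freed() wherever blocks are released and
rely on txn_commit() to issue the discards. The sketch does not merge
adjacent extents or handle barriers; a real implementation would
presumably want to do both.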

- Ted