Message-ID: <20080930141559.GO10831@mit.edu>
Date: Tue, 30 Sep 2008 10:15:59 -0400
From: Theodore Tso <tytso@....edu>
To: Alex Tomas <bzzz@....com>
Cc: Andreas Dilger <adilger@....com>, linux-ext4@...r.kernel.org
Subject: Re: Potential bug in mballoc --- reusing data blocks before txn commit
On Tue, Sep 30, 2008 at 05:12:12PM +0400, Alex Tomas wrote:
>> For ext4, the only reason to use a tree would be to allow us to merge
>> deleted extents. This might not be worth the complexity, though, I
>> admit it.
>
> strictly speaking, extents code should have merged them at allocation time.
Sorry, I wasn't clear enough. I was thinking of the scenario where
the user runs "rm -r" and deletes a directory hierarchy with lots of
small files. The merging I was talking about is between blocks
belonging to different files, so that we can send a single large
"trim" command to the disk. And since "rm -r" can delete a large
number of files in 5 seconds, and the blocks will likely be very close
together if the allocator is doing a good job and the filesystem is
relatively unfragmented, merging extents belonging to different files
would also save memory compared with keeping them as separate entries
on the linked list.
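To make the cross-file merging concrete, here is a rough userspace
sketch, not actual mballoc code. The names struct freed_extent and
merge_freed_extents() are made up for illustration, and it uses a
sorted array rather than a tree or linked list purely to keep the
example short. It coalesces freed block runs that touch or overlap,
regardless of which file they came from, so that each surviving entry
could back one large trim request:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one run of freed blocks; not a real ext4 struct. */
struct freed_extent {
	unsigned long long start;	/* first freed block number */
	unsigned long long len;		/* number of blocks in the run */
};

/* qsort() comparator: order extents by starting block. */
static int cmp_start(const void *a, const void *b)
{
	const struct freed_extent *x = a, *y = b;

	if (x->start < y->start)
		return -1;
	return x->start > y->start;
}

/*
 * Sort the extents by start block, then coalesce runs that are adjacent
 * or overlapping, in place.  Returns the number of extents left.
 */
static size_t merge_freed_extents(struct freed_extent *ext, size_t n)
{
	size_t out = 0, i;

	assert(n == 0 || ext != NULL);
	if (n == 0)
		return 0;

	qsort(ext, n, sizeof(*ext), cmp_start);
	for (i = 1; i < n; i++) {
		unsigned long long end = ext[out].start + ext[out].len;

		if (ext[i].start <= end) {
			/* Touches or overlaps the current run: extend it. */
			unsigned long long new_end = ext[i].start + ext[i].len;

			if (new_end > end)
				ext[out].len = new_end - ext[out].start;
		} else {
			/* Gap before this extent: start a new run. */
			ext[++out] = ext[i];
		}
	}
	return out + 1;
}
```

With three small files freed at blocks 100-103, 104-107 and 108-111,
the three entries collapse into one 12-block entry, which is both the
memory saving and the single large trim mentioned above.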
> oops. I meant in-core bitmap mballoc generates. if there is intention
> to get rid of old allocator (balloc.c), then we don't need b_committed_data.
Yes, I sent a patch on Sunday night proposing to do exactly that, as a
way of simplifying the code and reducing the test matrix for ext4.
> btw, I've just remembered why I decided don't protect data from reallocation:
> in data=writeback one can get block with stale data easily. and many people
> (to my knowledge) were using data=writeback as performing better.
Well, data=ordered is the default, so many more people would be
using data=ordered. If we think there is a significant advantage,
besides the memory utilization, in not protecting data from
reallocation, I suppose we could make the protection conditional on
the journaling mode and skip it only for data=writeback. Perhaps
having the additional data blocks available to the block allocator
would allow it to make better decisions. Not sure it's worth it,
though. Any thoughts?
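As a rough illustration of what making this conditional could look
like (a sketch only: enum data_mode and hold_until_commit() are
invented names, not the kernel's mount-option flags, and it assumes
freed metadata blocks are always held back until commit):

```c
#include <stdbool.h>

/* Illustrative journaling modes; not the kernel's actual flag names. */
enum data_mode { DATA_JOURNAL, DATA_ORDERED, DATA_WRITEBACK };

/*
 * Should a freed block be kept away from the allocator until the
 * transaction that freed it has committed?
 */
static bool hold_until_commit(enum data_mode mode, bool is_metadata)
{
	/* Assumption: metadata blocks are protected in every mode. */
	if (is_metadata)
		return true;

	/*
	 * data=writeback can already expose stale data, so under this
	 * policy it is the only mode that lets freed data blocks be
	 * reused right away.
	 */
	return mode != DATA_WRITEBACK;
}
```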
- Ted