Message-id: <20081001071703.GH3160@webber.adilger.int>
Date: Wed, 01 Oct 2008 00:17:03 -0700
From: Andreas Dilger <adilger@....com>
To: Theodore Tso <tytso@....edu>
Cc: Alex Tomas <bzzz@....com>, linux-ext4@...r.kernel.org
Subject: Re: Potential bug in mballoc --- reusing data blocks before txn commit
Theodore Tso wrote:
> Yeah, I know, Adrian Bunk strikes again.... but the right answer is
> to resurrect that code and add it back.
I submitted our patch to re-add this support yesterday.
On Sep 30, 2008 10:15 -0400, Theodore Ts'o wrote:
> On Tue, Sep 30, 2008 at 05:12:12PM +0400, Alex Tomas wrote:
> >> For ext4, the only reason to use a tree would be to allow us to merge
> >> deleted extents. This might not be worth the complexity, though, I
> >> admit it.
> >
> > strictly speaking, extents code should have merged them at allocation time.
>
> Sorry, I wasn't being clear enough. I was thinking of the scenario
> where the user runs "rm -r" and deletes a directory hierarchy with
> lots of small files. So the merging I was talking about was between
> blocks belonging to different files, so we can send a single large
> "trim" command to the disk.
I agree that this is probably the most efficient approach. There would be
an rbtree per transaction, and it would mostly be sparse.
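To illustrate the cross-file merging being discussed, here is a minimal sketch in Python, not the actual mballoc code. It assumes freed extents are (start, length) block ranges that never overlap, and it uses a sorted list with bisect as a stand-in for the per-transaction rbtree. The class and method names are hypothetical.

```python
import bisect


class FreedExtents:
    """Freed block ranges for one transaction, kept sorted and merged.

    Hypothetical stand-in for a per-transaction rbtree; assumes no
    range is freed twice (no overlapping extents).
    """

    def __init__(self):
        self.starts = []   # sorted start blocks
        self.lengths = []  # lengths, parallel to self.starts

    def free(self, start, length):
        i = bisect.bisect_left(self.starts, start)
        # Merge with the preceding extent if it ends where this one begins.
        if i > 0 and self.starts[i - 1] + self.lengths[i - 1] == start:
            i -= 1
            start = self.starts[i]
            length += self.lengths[i]
            del self.starts[i]
            del self.lengths[i]
        # Merge with the following extent if it begins where this one ends.
        if i < len(self.starts) and start + length == self.starts[i]:
            length += self.lengths[i]
            del self.starts[i]
            del self.lengths[i]
        self.starts.insert(i, start)
        self.lengths.insert(i, length)

    def extents(self):
        """Return the merged (start, length) ranges in block order."""
        return list(zip(self.starts, self.lengths))


# Small files freed by "rm -r" in arbitrary order collapse into one range.
fe = FreedExtents()
for start, length in [(100, 4), (108, 4), (200, 2), (104, 4)]:
    fe.free(start, length)
merged = fe.extents()
```

Each merged range could then be handed to the device as a single trim request at commit time, rather than one request per file.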
> > btw, I've just remembered why I decided not to protect data from reallocation:
> > in data=writeback one can get block with stale data easily. and many people
> > (to my knowledge) were using data=writeback as performing better.
>
> Well, data=ordered is the default, so there would be many more people
> using data=ordered. If we think there is a significant advantage in
> not protecting data from reallocation besides the memory utilization,
> I suppose we could make protecting data conditional on
> data=writeback. Perhaps having the additional data blocks available
> to the block allocator could allow it to make better decisions. Not
> sure it's worth it, though. Any thoughts?
I think for minimum complexity we should keep the rbtree all the time.
We would need it for the TRIM support in any case.
Having it enabled for ordered mode is fairly important, and I didn't
know that this little surprise was in the mballoc code at all. It may
explain some rare problems we've seen in the past.
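As a rough sketch of the behaviour I have in mind (a toy model in Python, not the ext4 or jbd code, with hypothetical names throughout): blocks freed inside a transaction stay on a per-transaction pending list and only return to the allocator once that transaction commits.

```python
class ToyAllocator:
    """Toy block allocator that defers reuse of freed blocks until commit."""

    def __init__(self, nblocks):
        self.free_blocks = set(range(nblocks))
        self.pending = {}  # transaction id -> blocks freed in that transaction

    def allocate(self):
        """Return the lowest free block, or None if nothing is available."""
        if not self.free_blocks:
            return None
        block = min(self.free_blocks)
        self.free_blocks.remove(block)
        return block

    def free(self, txn, block):
        # Do not make the block allocatable yet: if we crash before the
        # transaction commits, the old owner's data must still be intact.
        self.pending.setdefault(txn, set()).add(block)

    def commit(self, txn):
        """Release the blocks freed in txn; return them in block order."""
        released = self.pending.pop(txn, set())
        self.free_blocks |= released
        return sorted(released)


alloc = ToyAllocator(2)
first = alloc.allocate()
second = alloc.allocate()
alloc.free(7, first)               # freed in (hypothetical) transaction 7
before_commit = alloc.allocate()   # nothing reusable until the commit
released = alloc.commit(7)
after_commit = alloc.allocate()
```

The list returned by commit() is also the natural point at which freed ranges could be merged and passed on for TRIM.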
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.