Message-ID: <20071218033249.GQ7070@thunk.org>
Date: Mon, 17 Dec 2007 22:32:49 -0500
From: Theodore Tso <tytso@....edu>
To: linux-ext4@...r.kernel.org, Eric Sandeen <esandeen@...hat.com>
Subject: Re: What's cooking in e2fsprogs.git (topics)
On Mon, Dec 17, 2007 at 04:36:34PM -0700, Andreas Dilger wrote:
> > But the performance problems are starting to make me worry. Do you
> > know how many tdb entries you had before tdb performance started going
> > really badly down the toilet? I wonder if there are some tuning knobs
> > we could tweak to improve the performance numbers.
>
> There is some test data at
> https://bugzilla.lustre.org/attachment.cgi?id=13924 which is a PDF
> file. This shows 1000 items is reasonable, and 10000 is not.
I did some research, and the problem is that tdb uses a fixed number
of hash buckets. The default is 131 buckets, but you can pass in a
different hash size when you create the tdb table. So with 10,000
items, you will have an average of about 76 objects per hash chain
(10,000 / 131), and that doesn't work terribly well, obviously.
Berkdb's hash method uses an extensible hashing system which increases
the number of bits that are used in the hash, and breaks up the hash
buckets as they get too big, which is a much nicer self-tuning
algorithm. With tdb, you need to know from the get-go how much stuff
you're likely going to be storing, and adjust your hash size
accordingly.
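To make the arithmetic above concrete, here is a minimal sketch of how
one might estimate the average chain length for a fixed bucket count,
and pick a bucket count for an expected number of entries. The helper
names (avg_chain_length, suggest_hash_size) and the target chain length
are assumptions made for illustration; they are not part of tdb. The
result would be the sort of value you pass as the hash size when
creating the table.

```c
#include <stddef.h>

/*
 * Average number of entries per hash chain when `items` entries are
 * spread evenly over a fixed number of buckets. Assumes a uniform
 * hash; real chains will vary around this figure.
 */
static double avg_chain_length(size_t items, size_t buckets)
{
    if (buckets == 0)
        return 0.0;
    return (double)items / (double)buckets;
}

/*
 * Hypothetical helper: smallest bucket count that keeps the average
 * chain at or below `target_chain` entries for `expected_items`.
 * Always returns at least 1.
 */
static size_t suggest_hash_size(size_t expected_items, size_t target_chain)
{
    size_t buckets;

    if (target_chain == 0)
        target_chain = 1;
    /* Round up so the average never exceeds the target. */
    buckets = (expected_items + target_chain - 1) / target_chain;
    return buckets ? buckets : 1;
}
```

With the numbers from this thread, avg_chain_length(10000, 131) comes
out a little over 76, while suggest_hash_size(10000, 4) asks for 2500
buckets. The catch is the one already mentioned: the estimate is only
as good as your up-front guess at the number of entries.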
> The majority of the time is taken looking up existing entries, and this
> is due to one unusual requirement of the Lustre usage to be notified
> of duplicate insertions to detect duplicate use of objects, so this may
> have been a major factor in the slowdown. It isn't really practical to
> use a regular libext2fs bitmap for our case, since the collision space
> is a 64-bit integer, but maybe we could have done this with an RB tree
> or some other mechanism.
Well, if you only need an in-core data structure, and it doesn't need
to be stored on disk, have you looked at e2fsck/dict.c, which was
lifted from Kazlib? It's a userspace, single-file, in-memory-only
red-black (RB) tree implementation.
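As a rough illustration of the duplicate-detection idea with 64-bit
keys, here is a sketch of an in-memory ordered set whose insert reports
whether the key was already present. Note that this is a plain,
unbalanced binary search tree with made-up names (objnode,
objset_insert, objset_free), not the dict.c/Kazlib API and not a
red-black tree; a real RB tree adds rebalancing so that lookups stay
O(log n) even for sorted input.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node type for this sketch; not the dict.c API. */
struct objnode {
    uint64_t key;
    struct objnode *left, *right;
};

/*
 * Insert `key` into the tree rooted at *root.
 * Returns 0 if the key was new, 1 if it was already present
 * (a duplicate), and -1 if memory allocation failed.
 */
static int objset_insert(struct objnode **root, uint64_t key)
{
    struct objnode **link = root;

    /* Walk down to either the matching node or an empty slot. */
    while (*link != NULL) {
        if (key == (*link)->key)
            return 1;
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }

    *link = calloc(1, sizeof(**link));
    if (*link == NULL)
        return -1;
    (*link)->key = key;
    return 0;
}

/* Free every node in the tree (recursive, so depth matters). */
static void objset_free(struct objnode *node)
{
    if (node == NULL)
        return;
    objset_free(node->left);
    objset_free(node->right);
    free(node);
}
```

Inserting the same object id twice returns 0 the first time and 1 the
second, which is the "notify me of duplicate insertions" behaviour
described above. Without balancing, sorted input degrades this to a
linked list, which is exactly what an RB tree is there to avoid.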
Regards,
- Ted