Message-Id: <200702191614.49666.dada1@cosmosbay.com>
Date: Mon, 19 Feb 2007 16:14:49 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Evgeniy Polyakov <johnpol@....mipt.ru>
Cc: akepner@....com, linux@...izon.com, davem@...emloft.net,
netdev@...r.kernel.org, bcrl@...ck.org
Subject: Re: Extensible hashing and RCU
On Monday 19 February 2007 15:25, Evgeniy Polyakov wrote:
> On Mon, Feb 19, 2007 at 03:14:02PM +0100, Eric Dumazet (dada1@...mosbay.com) wrote:
> > > Forget about cache misses and cache lines - we have a hash table, only
> > > part of which is used (part for time-wait sockets, part for established
> > > ones).
> >
> > No, you did not read my mail. The current ehash is not as described by you.
>
> I did.
> And I also said that my tests do not have timewait sockets at all - I
> removed sk_for_each and so on, which should effectively double the lookup
> time on a busy system with lots of sockets created/removed per
> timeframe (that is theory from my side already).
> Anyway, I ran the same test with increased table too.
>
> > > Anyway, even with 2^20 (i.e. when the whole table is only used for
> > > established sockets) search time is about 360-370 nsec on 3.7 GHz Core
> > > Duo (only one CPU is used) with 2 GB of ram.
> >
> > Your tests are user land, so unfortunatly are biased...
> >
> > (Unless you use hugetlb data ?)
>
> No I do not. But the same can be applied to trie test - it is also
> performed in userspace and thus suffers from possible swapping/cache
> flushing and so on.
>
> And I doubt moving test into kernel will suddenly end up with 10 times
> increased rates.
At least some architectures pay a high price using vmalloc() instead of
kmalloc(), and TLB misses mean something for them. Not everybody has the
latest Intel CPU. Normally, the ehash table uses huge pages.
>
> Anyway, the trie test (broken implementation) is two times slower than the
> hash table (already resized), and it does not include the locking issues of
> the hash table access and further scalability issues.
>
You are mixing apples and oranges. We already know locking has nothing to do
with hashing or trie-ing. We *can* put RCU on top of the existing ehash. We
can also add hash resizing if we really care.
> I think I need to fix my trie implementation to fully show its
> potential, but original question was why tree/trie based implementation
> is not considered at all although it promises better performance and
> scalability.
Because you are mixing performance and scalability. That's not exactly the same
thing. Sometimes, high performance means *suboptimal* scalability.
Because O(1) is different from O(log(N))?
If N = 2^20, it certainly makes a difference.
Yes, 1% of chains might have a length > 10, but 99% of the lookups still touch
fewer than 4 cache lines.
With a binary tree, log2(2^20) is 20. Or maybe not? If you tell me it's 4, I
will be very pleased.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html