Message-ID: <45D89EFE.4080103@cosmosbay.com>
Date: Sun, 18 Feb 2007 19:46:22 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Evgeniy Polyakov <johnpol@....mipt.ru>
CC: akepner@....com, linux@...izon.com, davem@...emloft.net,
netdev@...r.kernel.org, bcrl@...ux.intel.com
Subject: Re: Extensible hashing and RCU
Evgeniy Polyakov wrote:
> On Mon, Feb 05, 2007 at 10:02:53AM -0800, akepner@....com (akepner@....com) wrote:
>> On Sat, 4 Feb 2007 linux@...izon.com wrote:
>>
>>> I noticed in an LCA talk mention that apparently extensible hashing
>>> with RCU access is an unsolved problem. Here's an idea for solving it.
>>> ....
>> Yes, I have been playing around with the same idea for
>> doing dynamic resizing of the TCP hashtable.
>>
>> Did a prototype "toy" implementation, and I have a
>> "half-finished" patch which resizes the TCP hashtable
>> at runtime. Hmmm, your mail may be the impetus to get
>> me to finally finish this thing....
>
> Why does no one want to use a trie? For socket-like loads it has
> exactly constant search/insert/delete time and scales as hell.
>
Because we want to be *very* fast. You cannot beat a hash table.

Say you have 1,000,000 TCP connections, with 50,000 incoming packets per
second to *random* streams...

With a 2^20-entry hash table, a lookup touches one cache line (the hash head
pointer) plus one cache line to get the socket (you need it to access its
refcounter).
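
To make the two-cache-line argument concrete, here is a minimal userspace
sketch in C of a chained hash lookup. It is not the kernel's ehash code: the
names conn_key, sock_entry, conn_table, conn_hash, table_create, table_insert
and table_lookup are hypothetical, the hash mixer is only illustrative, and
the refcount is a plain int where real code would need an atomic. With
1,000,000 entries in 2^20 = 1,048,576 buckets the load factor is about 0.95,
so a bucket holds just under one socket on average.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the kernel's struct sock or ehash. */
struct conn_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

struct sock_entry {
    struct sock_entry *next;   /* chain link within one bucket */
    struct conn_key key;
    int refcnt;                /* plain int here; real code would use an atomic */
};

struct conn_table {
    struct sock_entry **heads; /* one head pointer per bucket */
    uint32_t mask;             /* bucket count - 1; bucket count is a power of two */
};

/* Illustrative mixer only (an assumption); the kernel uses its own hash. */
static uint32_t conn_hash(const struct conn_key *k)
{
    uint32_t h = k->saddr ^ (k->daddr * 0x9e3779b1u);
    h ^= ((uint32_t)k->sport << 16) | k->dport;
    h ^= h >> 15;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h;
}

static int key_equal(const struct conn_key *a, const struct conn_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Allocate a table with 2^order empty buckets; order 20 gives 2^20 heads. */
static struct conn_table *table_create(unsigned int order)
{
    assert(order < 32);
    struct conn_table *t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    t->mask = ((uint32_t)1 << order) - 1;
    t->heads = calloc((size_t)t->mask + 1, sizeof(*t->heads));
    if (!t->heads) {
        free(t);
        return NULL;
    }
    return t;
}

/* Push the entry at the front of its bucket's chain. */
static void table_insert(struct conn_table *t, struct sock_entry *s)
{
    uint32_t b = conn_hash(&s->key) & t->mask;
    s->next = t->heads[b];
    t->heads[b] = s;
}

static struct sock_entry *table_lookup(struct conn_table *t,
                                       const struct conn_key *k)
{
    /* First memory touch: the bucket's head pointer. */
    struct sock_entry *s = t->heads[conn_hash(k) & t->mask];

    /* Second touch (more only on collisions): the socket itself. */
    for (; s; s = s->next) {
        if (key_equal(&s->key, k)) {
            s->refcnt++;       /* the reason the socket line must be read */
            return s;
        }
    }
    return NULL;
}
```

A caller would build the table once with table_create(20), insert each
connection with table_insert(), and call table_lookup() per incoming packet;
no locking or RCU is shown here.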

Several attempts were made in the past to add RCU to the ehash table (the last
by Benjamin LaHaise last March). I believe this was delayed a bit because
David would like to be able to resize the hash table...

I am not really interested in hash resizing, because an admin can size it at
boot time. But RCU is definitely *wanted*.

Note: It would be good to also use RCU for UDP, because the current rwlock
protecting udp_hash[] is a scalability problem.