[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4AF1F273.5020207@gmail.com>
Date: Wed, 04 Nov 2009 22:30:27 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Lucian Adrian Grijincu <lgrijincu@...acom.com>
CC: netdev@...r.kernel.org, Octavian Purdila <opurdila@...acom.com>
Subject: Re: [RFC] [PATCH] udp: optimize lookup of UDP sockets to by including
destination address in the hash key
Lucian Adrian Grijincu a écrit :
> Hi,
>
> Before you look at the patch attached: I know it's not the prettiest of patches sent here. It's a work-in-progress and is ugly and incomplete (for example IPv6 is not properly updated to benefit this patch and may even be as buggy as hell right now).
> I'm sending it here to ask for an opinion on the approach.
>
> The current approach uses one single hash to lookup UDP sockets.
> The key to the hash is calculated based on the port only.
> This works well if there aren't many sockets bound to the same port, but becomes a linear linked list search for a setup with many UPD packets bound to the same port, which does not scale much.
>
> In this patch the hashtable lookup key takes into consideration the address the socket is bound to too.
> This will lead to better distribution for setup where there are many UDP sockets bound to the same port but different IP addresses.
>
> There's a subtle bug lurking here: because some sockets may be bound to INADDR_ANY when checking whether a given UPD port is already taken one must search two buckets:
> * one with the key computed from the port and for the IP address of the socket
> * one with the key computed from the port and INADDR_ANY
>
> Now if the address by which we must do the lookup is INADDR_ANY there's another problem: we can't just search it's own bucket and the bucket for INADDR_ANY (the same bucket twice) like above because there might be a socket bound to a specific address and the port we're trying to bind to.
>
> In this case we could search the whole hash (which would take forever) or use another hash that is computed based on the port only (which this patch does).
>
> There are other things to look after: when modifying the hashes we must take spin_locks or spin_lock_bhs. Because we now have two hashtables and one hashtable might be searched twice locks must be taken in the proper order.
>
> I ran the attached UDP tester: it's a program that binds to a lot of UDP sockets and if run as a sender it sends lots of datagrams to the receiver which will just count the number of packets recevied.
>
>
>
> == Performance stats ==
> I ran this on a 2.6.31 with and without the patch on two powerpc single core processors connected directly by 100Mbps ethernet.
>
> * nsocks = number of IPv4 bound sockets
> * ndgrams = number of datagrams sent by each socket
> * payloadlen = the lenght of the UPD message sent
> * send_time = how many second did the sending operation take?
> * recv_dgrams = how many datagrams were successfully received?
>
>
> 1. nsocks=512 ndgrams=2000 payloadlen=1
> - without patch:
> * send_time = 27s to 31s
> * recv_dgrams = 550 to 850
> - with patch:
> * send_time = 35s to 36s
> * recv_dgrams = 507000 to 516000
>
> 2. nsocks=512 ndgrams=4000 payloadlen=1
> - without patch:
> * send_time = 53s to 60s
> * recv_dgrams = 650 to 1100
> - with patch:
> * send_time = ~70s
> * recv_dgrams = 1010000 to 1034000
>
> 3. nsocks=512 ndgrams=4000 payloadlen=1000
> - without patch:
> * send_time = ~74s
> * recv_dgrams = 650
> - with patch:
> * send_time = ~88s
> * recv_dgrams = 995000
>
> 4. nsocks=256 ndgrams=2000 payloadlen=1
> - without patch:
> * send_time = 13s
> * recv_dgrams = 370 to 720
> - with patch:
> * send_time = 17s
> * recv_dgrams = 267000
>
>
> As you can see this has a hit on total send time, but the number of received packets increases by two orders of magnitude.
>
> I'd like hear your opinion about this approach.
> If you'd like to test this, I'll be happy to port it to net-next and provide the patch.
>
>
I knew someone would do this kind of patch one day, I tried it one year ago :)
First of all, you are mixing several things in this patch.
Dont do this, its not possible for us to correctly review such complex patch.
Then, your patch is not based on net-next-2.6, and you really need to work on this tree.
Then, if you had worked on net-next-2.6, you whould have noticed UDP hash tables
are now dynamically sized at boot.
An admin can even force a 65536 slots hash table for heavy duty UDP servers.
Then, last point : Say I have a machine with 65000 udp sockets bound to a different port,
and a 65536 slots hash table, (sane values in fact, in order to have best performances),
then your two phase lookup will be slower than the one-phase current lookup (two cache misses
instead of one)
So your patch seems to solve a pathological case (where many udp sockets are
bounded to a particular port, but on many different IPs), and slow down 99%
of other uses.
Thanks
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists