Message-Id: <200911051825.45749.opurdila@ixiacom.com>
Date: Thu, 5 Nov 2009 18:25:45 +0200
From: Octavian Purdila <opurdila@...acom.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Lucian Adrian Grijincu <lgrijincu@...acom.com>,
netdev@...r.kernel.org
Subject: Re: [RFC] [PATCH] udp: optimize lookup of UDP sockets by including destination address in the hash key
On Thursday 05 November 2009 01:32:18 you wrote:
> >
> > Very true, the benchmark itself shows a significant overhead increase on
> > the TX side, and indeed this case is not very common. But for us it's an
> > important use case.
> >
> > Maybe there is a more clever way of fixing this specific use case without
> > hurting the common case?
>
> Clever way ? Well, we will see :)
>
> I now understand previous Lucian patch (best match) :)
>
> Could you please describe your use case? I guess something is possible,
> not necessarily hurting performance of regular use cases :)
>
IIRC, we first saw this issue in VoIP tests with up to 16000 sockets bound to
the same port, each on a different IP address (each IP address is assigned to a
particular interface). We need this setup in order to emulate lots of VoIP
users, each with a different IP address and possibly a different L2 encapsulation.
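
To make this concrete, here is a stripped-down sketch of how such a test setup
ends up with that many sockets (the 10.0.x.y addresses, port 5060 and the error
handling below are purely illustrative, not our actual tool):

/*
 * Illustrative only: one UDP socket per emulated VoIP user, all bound to the
 * same port, each to its own local address.  The addresses are assumed to be
 * already configured on the test interfaces.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	enum { NSOCKS = 16000, PORT = 5060 };	/* made-up port and count */
	int i;

	for (i = 0; i < NSOCKS; i++) {
		struct sockaddr_in sin;
		char addr[32];
		int fd = socket(AF_INET, SOCK_DGRAM, 0);

		if (fd < 0) {
			perror("socket");
			exit(1);
		}

		/* hypothetical per-user local address, e.g. 10.0.x.y */
		snprintf(addr, sizeof(addr), "10.0.%d.%d", i / 250, i % 250 + 1);

		memset(&sin, 0, sizeof(sin));
		sin.sin_family = AF_INET;
		sin.sin_port = htons(PORT);
		inet_pton(AF_INET, addr, &sin.sin_addr);

		if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
			perror("bind");
			exit(1);
		}
		/* sockets stay open for the lifetime of the test */
	}
	pause();
	return 0;
}

Since all of these sockets share one port, they all land in the same slot of
the port-hashed UDP lookup table, which is where the long-chain problem comes
from.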
Now, as a general note, I should say that our use cases can seem absurd if you
take them out of the network testing field :) but my _personal_ opinion is that
a better integration between our code base and upstream code may benefit both
upstream and us:
- for us it gives the ability to stay close to upstream and get all of the new
shiny features without painful upgrades
- for upstream, even if most systems don't run into these scalability issues
now, I see that some people are moving in that direction (see the recent PPP
problems); also, stressing Linux in that regard can only make the code better
- as long as the approach taken is clean and sound
- we (or our customers) use a plethora of networking devices for testing, so
exposing Linux early to those devices can only help catch issues earlier
In short: expect more absurd patches from us :)
> I have struct reorderings in progress to reduce the number of cache lines read
> per socket from two to one. So this would reduce by 50% the time to find
> a particular socket in the chain.
>
> But if you *really* want/need 512 sockets bound to the _same_ port, we probably
> can use secondary hash tables (or an rbtree), as soon as we stack more than
> XX sockets on a particular slot.
>
> At lookup, we check if extended hash table exists before doing
> normal rcu lookup.
>
> It probably can be done in under 300 lines of code.
> On normal machines, these extra tables/trees would not be used/allocated.
>
Yep, that should work. Will respin the patch based on this idea and see what
we get, but it will take a while.
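
For reference, this is roughly how I read the idea, as a user-space sketch (all
names, the 256-bucket size and the hash function below are invented for
illustration, and the RCU/locking details are left out):

#include <stddef.h>

/* simplified stand-in for the real socket structure */
struct sock {
	unsigned int	daddr;		/* bound local address */
	unsigned short	port;		/* bound local port    */
	struct sock	*next;
};

/* secondary table attached to one overloaded slot, keyed on (addr, port) */
struct slot2 {
	struct sock	*chain[256];
};

/* one bucket of the primary, port-only hash table */
struct slot {
	struct sock	*chain;
	unsigned int	chain_len;
	struct slot2	*secondary;	/* allocated only past some threshold */
};

static unsigned int hash2(unsigned int daddr, unsigned short port)
{
	return (daddr ^ port) & 255;	/* placeholder hash */
}

static struct sock *chain_lookup(struct sock *sk, unsigned int daddr,
				 unsigned short port)
{
	for (; sk; sk = sk->next)
		if (sk->port == port && sk->daddr == daddr)
			return sk;
	return NULL;
}

struct sock *udp_lookup(struct slot *slot, unsigned int daddr,
			unsigned short port)
{
	/* check for the extended table before the normal lookup */
	if (slot->secondary)
		return chain_lookup(slot->secondary->chain[hash2(daddr, port)],
				    daddr, port);

	/* common case: short chain, plain walk (RCU in the real kernel) */
	return chain_lookup(slot->chain, daddr, port);
}

On a normal machine no slot ever grows past the threshold, so the secondary
table is never allocated and the only added cost on the fast path is the NULL
check.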
Thanks,
tavi