Date: Thu, 03 Mar 2011 07:51:47 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: netdev@...r.kernel.org
Subject: Re: inetpeer with create==0
On Wednesday, March 2, 2011 at 20:45 -0800, David Miller wrote:
> Eric, I was profiling the non-routing-cache case and something that stuck
> out is the case of calling inet_getpeer() with create==0.
>
> If an entry is not found, we have to redo the lookup under a spinlock
> to make certain that a concurrent writer rebalancing the tree does
> not "hide" an existing entry from us.
>
> This makes the case of a create==0 lookup for a not-present entry
> really expensive. It is on the order of 600 cpu cycles on my
> Niagara2.
>
Well, doesn't your test assume all the data is already in CPU caches?

I'll take a look, but my reasoning was that in a DDoS situation the real
cost is bringing data into the caches. With a tree depth of 20, cache
misses are the dominant cost.

The second lookup was basically free.
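
For reference, the pattern I have in mind looks roughly like this. This is
a simplified userspace sketch, not the actual inet_getpeer() code; the
names (inet_getpeer_sketch, peer_lock, struct peer) are invented for
illustration:

```c
/* Simplified sketch (invented names, not the real kernel code) of the
 * create==0 pattern in inet_getpeer(): a lockless lookup whose miss must
 * normally be confirmed under a lock, because a concurrent AVL rebalance
 * can momentarily hide a live node from an unlocked walker. */
#include <stddef.h>

struct peer { unsigned int key; struct peer *left, *right; };

/* Stand-ins for the real spinlock; no-ops in this sketch. */
static void peer_lock(void)   { }
static void peer_unlock(void) { }

/* Plain BST walk; only a hint when run without the lock held. */
static struct peer *lookup(struct peer *root, unsigned int key)
{
	while (root) {
		if (key == root->key)
			return root;
		root = (key < root->key) ? root->left : root->right;
	}
	return NULL;
}

static struct peer *inet_getpeer_sketch(struct peer *root,
					unsigned int key, int create)
{
	struct peer *p = lookup(root, key);	/* lockless attempt */

	if (p)
		return p;

	/* Confirm the miss (and possibly insert) under the lock; skipping
	 * this relookup when create == 0 is the optimization in question. */
	peer_lock();
	p = lookup(root, key);
	/* ... allocate and insert here when create != 0 and p == NULL ... */
	peer_unlock();
	return p;
}
```

The point being: the relookup under the lock touches nodes that the first
walk has just pulled into cache, which is why it looked almost free to me.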
> I added a hack to not do the relookup under the lock when create==0
> and it now costs less than 300 cycles.
>
Hmm... I'm curious; could you send this hack to me? ;)
> This is now a pretty common operation with the way we handle COW'd
> metrics, so I think it's definitely worth optimizing.
>
> I looked at the generic radix tree implementation, and it supports
> full RCU lookups in parallel with insert/delete. It handles the race
> case without the relookup under lock because it creates fixed paths
> to "slots" where nodes live using shifts and masks. So if a path
> to a slot ever existed, it will always exist.
>
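
Right, the trick is that each level of the radix tree consumes a fixed
slice of the index via shift and mask, so the path to a slot is a pure
function of the key. An illustrative sketch (the constants here are my
own; the kernel uses RADIX_TREE_MAP_SHIFT, typically 6):

```c
/* Minimal sketch of the fixed-path property of lib/radix-tree.c:
 * each level consumes RADIX_BITS of the key via shift and mask, so
 * the slot a given index maps to never moves. Constants are
 * illustrative, not the kernel's. */
#define RADIX_BITS	6
#define RADIX_MASK	((1UL << RADIX_BITS) - 1)

/* Slot index at a given level for an unsigned long key. */
static unsigned long slot_index(unsigned long key, unsigned int level)
{
	return (key >> (level * RADIX_BITS)) & RADIX_MASK;
}
```

Since no rebalancing ever moves a slot, a lockless reader that misses can
trust the miss; there is nothing a concurrent writer could have hidden.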
> Take a look at lib/radix-tree.c and include/linux/radix-tree.h if
> you are curious.
>
> I think we should do something similar for inetpeer. Currently we
> cannot just use the existing generic radix-tree code because it only
> supports indexes as large as "unsigned long" and we need to handle
> 128-bit ipv6 addresses.
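
If we wanted the same scheme for 128-bit keys, one way would be to carry
the IPv6 address as two 64-bit halves and stitch the slot bits together by
hand. A hypothetical sketch (struct key128 and slot_index128 are invented
names, and the constants are illustrative):

```c
/* Hypothetical extension of the shift-and-mask slot computation to a
 * 128-bit key stored as two 64-bit halves; the generic radix-tree
 * cannot do this because its index is an unsigned long. */
#include <stdint.h>

#define RADIX_BITS	6
#define RADIX_MASK	((1ULL << RADIX_BITS) - 1)

struct key128 { uint64_t hi, lo; };

/* Slot index at a given level, treating hi:lo as one 128-bit key
 * consumed from the low bits upward. A level whose bit range spans
 * the 64-bit boundary is stitched together from both halves. */
static uint64_t slot_index128(struct key128 k, unsigned int level)
{
	unsigned int shift = level * RADIX_BITS;

	if (shift >= 64)
		return (k.hi >> (shift - 64)) & RADIX_MASK;
	if (shift + RADIX_BITS <= 64)
		return (k.lo >> shift) & RADIX_MASK;
	/* straddles the boundary: low bits from lo, the rest from hi */
	return ((k.lo >> shift) | (k.hi << (64 - shift))) & RADIX_MASK;
}
```

With that, the fixed-path property carries over unchanged; only the index
arithmetic grows to two words.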