Message-ID: <472A12D0.4070401@cosmosbay.com>
Date: Thu, 01 Nov 2007 18:54:24 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Stephen Hemminger <shemminger@...ux-foundation.org>
CC: "David S. Miller" <davem@...emloft.net>,
Linux Netdev List <netdev@...r.kernel.org>,
Andi Kleen <ak@...e.de>,
Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table
Stephen Hemminger wrote:
> On Thu, 01 Nov 2007 11:16:20 +0100
> Eric Dumazet <dada1@...mosbay.com> wrote:
>
>> As done two years ago on IP route cache table (commit
>> 22c047ccbc68fa8f3fa57f0e8f906479a062c426) , we can avoid using one lock per
>> hash bucket for the huge TCP/DCCP hash tables.
>>
>> On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little
>> performance difference. (We hit a different cache line for the rwlock, but
>> then the bucket cache line has a better sharing factor among CPUs, since we
>> dirty it less often.)
>>
>> Using a 'small' table of hashed rwlocks should be more than enough to provide
>> correct SMP concurrency between different buckets, without using too much
>> memory. Sizing of this table depends on NR_CPUS and various CONFIG settings.
>>
>> This patch provides a locking abstraction that may ease future work using
>> a different model for the TCP/DCCP table.
>>
>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>>
>> include/net/inet_hashtables.h | 40 ++++++++++++++++++++++++++++----
>> net/dccp/proto.c | 16 ++++++++++--
>> net/ipv4/inet_diag.c | 9 ++++---
>> net/ipv4/inet_hashtables.c | 7 +++--
>> net/ipv4/inet_timewait_sock.c | 13 +++++-----
>> net/ipv4/tcp.c | 11 +++++++-
>> net/ipv4/tcp_ipv4.c | 11 ++++----
>> net/ipv6/inet6_hashtables.c | 19 ++++++++-------
>> 8 files changed, 89 insertions(+), 37 deletions(-)
>>
>
> Longterm is there any chance of using rcu for this? Seems like
> it could be a big win.
>
This was discussed in the past, and I believe a patch was even proposed,
but some people (including David) objected that RCU is well suited only to
'mostly read' structures.
On some web server workloads, the TCP hash table is constantly accessed in write
mode (socket creation, socket moving to timewait state, socket deletion...), and
RCU added overhead and poor cache reuse (because sockets must be placed on an
RCU queue before they can be reused).
On these typical workloads, a hash table without RCU is still the best.
Long-term changes would rather be based on Robert Olsson's suggestion from last
year (trie-based lookups and a unified IP/TCP cache).
Short-term changes would be to make the TCP hash table resizable (small at
boot, and able to grow if necessary). Its current size on modern machines is
just insane.