Message-ID: <49809716.3020204@cosmosbay.com>
Date: Wed, 28 Jan 2009 18:34:14 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Patrick McHardy <kaber@...sh.net>
CC: Rick Jones <rick.jones2@...com>,
Netfilter Developers <netfilter-devel@...r.kernel.org>,
Linux Network Development list <netdev@...r.kernel.org>,
Stephen Hemminger <shemminger@...tta.com>
Subject: Re: 32 core net-next stack/netfilter "scaling"
Eric Dumazet wrote:
> Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Rick Jones wrote:
>>>> Anyhow, the spread on trans/s/netperf is now 600 to 500 or 6000, which
>>>> does represent an improvement.
>>>>
>>> Yes indeed you have a speedup, tcp conntracking is OK.
>>>
>>> You now hit the nf_conntrack_lock spinlock we have in generic
>>> conntrack code (net/netfilter/nf_conntrack_core.c)
>>>
>>> nf_ct_refresh_acct() for instance has to lock it.
>>>
>>> We really want some finer locking here.
>> That looks more complicated since it requires taking multiple locks
>> occasionally (f.i. hash insertion, potentially helper-related and
>> expectation-related stuff), and there is the unconfirmed_list, where
>> fine-grained locking can't really be used without changing it to
>> a hash.
>>
>
> Yes, it's more complicated, but look at what we did in 2.6.29 for tcp/udp
> sockets, using RCU to have lockless lookups.
> Yes, we still take a lock when doing an insert or delete at socket
> bind/unbind time.
>
> We could keep a central nf_conntrack_lock to guard insertions/deletes
> from hash and unconfirmed_list.
>
> But *normal* packets that only need to change state of one particular
> connection could use RCU (without spinlock) to locate the conntrack,
> then lock the found conntrack to perform all state changes.
Well... RCU is already used by conntrack :)
Maybe only __nf_ct_refresh_acct() needs to avoid taking nf_conntrack_lock.
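
To make the pattern from the quoted text concrete (lockless lookup, then
lock only the conntrack that was found), here is a rough userspace sketch.
It is not the kernel code: C11 atomics and pthread mutexes stand in for RCU
and spinlocks, and every name in it (struct conn, conn_insert, conn_lookup,
conn_account, table_lock) is hypothetical. Deletion is left out on purpose,
since freeing an entry safely is exactly what would need an RCU grace
period.

```c
/* Userspace sketch only; all names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 64

struct conn {
    uint32_t key;               /* stand-in for the connection tuple */
    pthread_mutex_t lock;       /* per-connection lock for state changes */
    unsigned long packets;      /* example per-connection state */
    struct conn *_Atomic next;  /* hash chain link */
};

static struct conn *_Atomic buckets[NBUCKETS];

/* Central lock: taken only for insertion (and, in a full version, deletion). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert at the head of the chain under the central lock. */
static struct conn *conn_insert(uint32_t key)
{
    struct conn *c = calloc(1, sizeof(*c));
    assert(c != NULL);
    c->key = key;
    pthread_mutex_init(&c->lock, NULL);

    pthread_mutex_lock(&table_lock);
    struct conn *head = atomic_load_explicit(&buckets[key % NBUCKETS],
                                             memory_order_relaxed);
    atomic_init(&c->next, head);
    /* The release store publishes the fully initialized entry to readers. */
    atomic_store_explicit(&buckets[key % NBUCKETS], c, memory_order_release);
    pthread_mutex_unlock(&table_lock);
    return c;
}

/* Lockless lookup: readers never touch table_lock. */
static struct conn *conn_lookup(uint32_t key)
{
    struct conn *c = atomic_load_explicit(&buckets[key % NBUCKETS],
                                          memory_order_acquire);
    while (c != NULL && c->key != key)
        c = atomic_load_explicit(&c->next, memory_order_acquire);
    return c;
}

/* "Normal packet" path: find the entry without the central lock,
 * then lock only that entry to change its state. */
static int conn_account(uint32_t key)
{
    struct conn *c = conn_lookup(key);
    if (c == NULL)
        return -1;
    pthread_mutex_lock(&c->lock);
    c->packets++;
    pthread_mutex_unlock(&c->lock);
    return 0;
}
```

With this shape, two packets for different connections contend on nothing
shared except the hash chain reads; only inserts serialize on table_lock.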