[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4B20064C.7070301@gmail.com>
Date: Wed, 09 Dec 2009 21:19:24 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Octavian Purdila <opurdila@...acom.com>
CC: netdev@...r.kernel.org,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: [RFC PATCH] llc: convert the socket list to RCU locking (was
Re: [PATCH 3/4] llc: use a device based hash table to speed up multicast
delivery)
Le 08/12/2009 22:10, Octavian Purdila a écrit :
> On Friday 04 December 2009 01:52:44 you wrote:
>
>>> Since at this point we are using UP ports contention is not really an
>>> issue for us. I've extrapolated this (lock per hash bucket) based on how
>>> locking is done in other places, like UDP.
>>
>> Yes but you know we want to remove those locks per UDP hash bucket, since
>> we dont really need them anymore. ;)
>>
>>
>> If you remember, we had in the past one rwlock for the whole UDP table.
>>
>> Then this was converted to one spinlock per hash slot (128 slots) + RCU
>> lookups for unicast RX
>>
>> Then we dynamically sized udp table at boot (up to 65536 slots)
>>
>> multicast optimization (holding lock for small duration + double hashing)
>>
>> bind optimization (thanks to double hashing)
>>
>> To be done :
>>
>> 1) multicast RX can be done without taking any lock, and RCU lookups
>> 2) zap all locks and use one lock, or a small array of hashed spinlocks
>>
>
> OK, here is my first try at llc RCU-fication.
>
> One doubt before pasting the code: In slab.h comment and in udp.c I see the lookup is restarted if an improper object is returned. Is that really required?
>
> Thanks!
> tavi
>
> [RFC PATCH] llc: convert the socket list to RCU locking
>
> For the reclamation phase we use the SLAB_DESTROY_BY_RCU mechanism,
> which require some extra checks in the lookup code:
>
> a) Since socket can be free while looking it up, or getting a
> reference to it, we need to check the socket reference counter to make
> sure the reference we got points to a live socket
>
> b) It can also happen that the reference we got during lookup will be
> freed and the memory reused by another socket. Thus, after getting the
> reference, we must check again that we got the right socket.
>
> Note that the /proc code was not yet converted to RCU, it still uses
> spinlocks for protection.
>
> Signed-off-by: Octavian Purdila <opurdila@...acom.com>
> ---
> include/net/llc.h | 7 ++--
> net/llc/af_llc.c | 1 +
> net/llc/llc_conn.c | 89 ++++++++++++++++++++++++++++++++++------------------
> net/llc/llc_core.c | 5 ++-
> net/llc/llc_proc.c | 20 ++++++------
> net/llc/llc_sap.c | 72 ++++++++++++++++++++++++++++-------------
> 6 files changed, 124 insertions(+), 70 deletions(-)
>
> diff --git a/include/net/llc.h b/include/net/llc.h
> index 7940da1..1559cf1 100644
> --- a/include/net/llc.h
> +++ b/include/net/llc.h
> @@ -16,6 +16,7 @@
> #include <linux/if_ether.h>
> #include <linux/list.h>
> #include <linux/spinlock.h>
> +#include <linux/rculist_nulls.h>
>
> #include <asm/atomic.h>
>
> @@ -53,10 +54,8 @@ struct llc_sap {
> struct net_device *orig_dev);
> struct llc_addr laddr;
> struct list_head node;
> - struct {
> - rwlock_t lock;
> - struct hlist_head list;
> - } sk_list;
> + spinlock_t sk_lock;
> + struct hlist_nulls_head sk_list;
> };
>
> #define LLC_DEST_INVALID 0 /* Invalid LLC PDU type */
> diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
> index ffc96d3..baefb4f 100644
> --- a/net/llc/af_llc.c
> +++ b/net/llc/af_llc.c
> @@ -140,6 +140,7 @@ static struct proto llc_proto = {
> .name = "LLC",
> .owner = THIS_MODULE,
> .obj_size = sizeof(struct llc_sock),
> + .slab_flags = SLAB_DESTROY_BY_RCU,
> };
>
> /**
> diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c
> index c6bab39..8331fc8 100644
> --- a/net/llc/llc_conn.c
> +++ b/net/llc/llc_conn.c
> @@ -468,6 +468,19 @@ static int llc_exec_conn_trans_actions(struct sock *sk,
> return rc;
> }
>
> +static inline bool llc_estab_match(const struct llc_sap *sap,
> + const struct llc_addr *daddr,
> + const struct llc_addr *laddr,
> + const struct sock *sk)
> +{
> + struct llc_sock *llc = llc_sk(sk);
> +
> + return llc->laddr.lsap == laddr->lsap &&
> + llc->daddr.lsap == daddr->lsap &&
> + llc_mac_match(llc->laddr.mac, laddr->mac) &&
> + llc_mac_match(llc->daddr.mac, daddr->mac);
> +}
> +
> /**
> * __llc_lookup_established - Finds connection for the remote/local sap/mac
> * @sap: SAP
> @@ -484,23 +497,24 @@ static struct sock *__llc_lookup_established(struct llc_sap *sap,
> struct llc_addr *laddr)
> {
> struct sock *rc;
> - struct hlist_node *node;
> -
> - read_lock(&sap->sk_list.lock);
> - sk_for_each(rc, node, &sap->sk_list.list) {
> - struct llc_sock *llc = llc_sk(rc);
> -
> - if (llc->laddr.lsap == laddr->lsap &&
> - llc->daddr.lsap == daddr->lsap &&
> - llc_mac_match(llc->laddr.mac, laddr->mac) &&
> - llc_mac_match(llc->daddr.mac, daddr->mac)) {
> - sock_hold(rc);
> + struct hlist_nulls_node *node;
> +
> + rcu_read_lock();
> + sk_nulls_for_each_rcu(rc, node, &sap->sk_list) {
> + if (llc_estab_match(sap, daddr, laddr, rc)) {
> + /* Extra checks required by SLAB_DESTROY_BY_RCU */
> + if (unlikely(!atomic_inc_not_zero(&rc->sk_refcnt)))
> + continue;
Hmm, this wont work in fact, because if we have several llc_sap allocated on machine,
we have no guarantee a freed socket wont be reused and inserted in another llc_sap list.
So before calling llc_estab_match() (and/or llc_listener_match()) you should
check if 'rc' really belong to current sap.
If not, you must restart the lookup (you hit a socket that was moved to another sap)
> + if (unlikely(!llc_estab_match(sap, daddr, laddr, rc))) {
> + sock_put(rc);
> + continue;
> + }
> goto found;
> }
> }
> rc = NULL;
> found:
> - read_unlock(&sap->sk_list.lock);
> + rcu_read_unlock();
> return rc;
> }
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists