[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4A5DCB7C.9000502@gmail.com>
Date: Wed, 15 Jul 2009 14:28:44 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>, kaber@...sh.net
CC: netdev@...r.kernel.org, paulmck@...ux.vnet.ibm.com
Subject: [PATCH] net: nf_conntrack_alloc() should not use kmem_cache_zalloc()
David, Patrick
Here is a fix for nf_conntrack, candidate for linux-2.6.31 and stable (linux-2.6.30)
Thank you
[PATCH] net: nf_conntrack_alloc() should not use kmem_cache_zalloc()
When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating
objects, since slab allocator could give a freed object still used by lockless
readers.
In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next
being always valid (ie containing a valid 'nulls' value, or a valid pointer to next
object in hash chain.)
kmem_cache_zalloc() setups object with NULL values, but a NULL value is not valid
for ct->tuplehash[xxx].hnnode.next.
Fix is to call kmem_cache_alloc() and do the zeroing ourself.
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7508f11..23feafa 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -561,17 +561,28 @@ struct nf_conn *nf_conntrack_alloc(struct net *net,
}
}
- ct = kmem_cache_zalloc(nf_conntrack_cachep, gfp);
+ /*
+ * Do not use kmem_cache_zalloc(), as this cache uses
+ * SLAB_DESTROY_BY_RCU.
+ */
+ ct = kmem_cache_alloc(nf_conntrack_cachep, gfp);
if (ct == NULL) {
pr_debug("nf_conntrack_alloc: Can't alloc conntrack.\n");
atomic_dec(&net->ct.count);
return ERR_PTR(-ENOMEM);
}
-
+ /*
+ * Let ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.next
+ * and ct->tuplehash[IP_CT_DIR_REPLY].hnnode.next unchanged.
+ */
+ memset(&ct->tuplehash[IP_CT_DIR_MAX], 0,
+ sizeof(*ct) - offsetof(struct nf_conn, tuplehash[IP_CT_DIR_MAX]));
spin_lock_init(&ct->lock);
atomic_set(&ct->ct_general.use, 1);
ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple = *orig;
+ ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.pprev = NULL;
ct->tuplehash[IP_CT_DIR_REPLY].tuple = *repl;
+ ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev = NULL;
/* Don't set timer yet: wait for confirmation */
setup_timer(&ct->timeout, death_by_timeout, (unsigned long)ct);
#ifdef CONFIG_NET_NS
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists