Date: Tue, 24 Mar 2009 13:43:54 +0100
From: Patrick McHardy <kaber@...sh.net>
To: Eric Dumazet <dada1@...mosbay.com>
CC: Joakim Tjernlund <Joakim.Tjernlund@...nsmode.se>, avorontsov@...mvista.com,
	netdev@...r.kernel.org, "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] conntrack: Reduce conntrack count in nf_conntrack_free()

Eric Dumazet wrote:
>> In a stress situation, you feed more deleted conntracks to call_rcu() than
>> the blimit (10 real freeing per RCU softirq invocation).
>>
>> So with default qhimark being 10000, this means about 10000 conntracks
>> can sit in RCU (per CPU) before being really freed.
>>
>> Only when hitting 10000, RCU enters a special mode to free all queued items, instead
>> of a small batch of 10
>>
>> To solve your problem we can :
>>
>> 1) reduce qhimark from 10000 to 1000 (for example)
>>    Probably should be done to reduce some spikes in RCU code when freeing
>>    whole 10000 elements...
>> OR
>> 2) change conntrack tunable (max conntrack entries on your machine)
>> OR
>> 3) change net/netfilter/nf_conntrack_core.c to decrement net->ct.count
>>    in nf_conntrack_free() instead of callback.
>>
>> [PATCH] conntrack: Reduce conntrack count in nf_conntrack_free()
>>
>> We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might
>> accumulate about 10.000 elements per CPU in its internal queues. To get accurate
>> conntrack counts (at the expense of slightly more RAM used), we might consider
>> conntrack counter not taking into account "about to be freed elements, waiting
>> in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU
>> callback.
>>
>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>>
>>
>> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
>> index f4935e3..6478dc7 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -516,16 +516,17 @@ EXPORT_SYMBOL_GPL(nf_conntrack_alloc);
>>  static void nf_conntrack_free_rcu(struct rcu_head *head)
>>  {
>>  	struct nf_conn *ct = container_of(head, struct nf_conn, rcu);
>> -	struct net *net = nf_ct_net(ct);
>>
>>  	nf_ct_ext_free(ct);
>>  	kmem_cache_free(nf_conntrack_cachep, ct);
>> -	atomic_dec(&net->ct.count);
>>  }
>>
>>  void nf_conntrack_free(struct nf_conn *ct)
>>  {
>> +	struct net *net = nf_ct_net(ct);
>> +
>>  	nf_ct_ext_destroy(ct);
>> +	atomic_dec(&net->ct.count);
>>  	call_rcu(&ct->rcu, nf_conntrack_free_rcu);
>>  }
>>  EXPORT_SYMBOL_GPL(nf_conntrack_free);
>
> I forgot to say this is what we do for 'struct file' freeing as well. We
> decrement nr_files in file_free(), not in file_free_rcu()

While temporarily exceeding the limit by up to 10000 entries is
quite a lot, I guess the important thing is that it can't grow
unbounded, so I think this patch is fine.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
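
Archive note: the effect discussed in this thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code. The names `model`, `model_alloc`, `model_free`, `model_deferred_pass` and `model_run` are hypothetical, the deferred queue is a plain counter standing in for the per-CPU RCU queue, and only the batch size of 10 is taken from the thread. The qhimark "free everything" mode is deliberately not modelled.

```c
#include <assert.h>

/* Hypothetical user-space model of the accounting change; not kernel code. */
enum { MODEL_BLIMIT = 10 };  /* objects really freed per deferred pass (from the thread) */

struct model {
	long count;     /* limit-checked counter, standing in for net->ct.count */
	long pending;   /* objects waiting for deferred freeing (stand-in for the RCU queue) */
	int  dec_early; /* 1: decrement in the free path (patched), 0: in the callback (original) */
};

/* Returns 0 on success, -1 when the counter has reached the limit. */
static int model_alloc(struct model *m, long max)
{
	if (m->count >= max)
		return -1;
	m->count++;
	return 0;
}

/* Stand-in for nf_conntrack_free(): queue the object for deferred freeing. */
static void model_free(struct model *m)
{
	if (m->dec_early)
		m->count--;
	m->pending++;
}

/* Stand-in for one deferred pass: at most MODEL_BLIMIT objects are released. */
static void model_deferred_pass(struct model *m)
{
	long n = m->pending < MODEL_BLIMIT ? m->pending : MODEL_BLIMIT;

	m->pending -= n;
	if (!m->dec_early)
		m->count -= n;
}

/*
 * Each round attempts `burst` allocate-then-free pairs, followed by one
 * deferred pass. Returns the number of allocations refused at the limit.
 */
static long model_run(int dec_early, long max, long burst, long rounds)
{
	struct model m = { 0, 0, dec_early };
	long dropped = 0;

	for (long r = 0; r < rounds; r++) {
		for (long i = 0; i < burst; i++) {
			if (model_alloc(&m, max) == 0)
				model_free(&m);
			else
				dropped++;
		}
		model_deferred_pass(&m);
	}
	assert(m.count >= 0 && m.pending >= 0);
	return dropped;
}
```

Calling `model_run(0, 100, 50, 10)` models the original accounting, where the counter stays inflated by objects that are already logically freed, so allocations are refused even though no object is live. `model_run(1, 100, 50, 10)` models the patched accounting, where no allocation is refused, at the cost of queued objects still occupying memory until the deferred pass releases them.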