lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49848A11.7020908@cosmosbay.com>
Date:	Sat, 31 Jan 2009 18:27:45 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Stephen Hemminger <shemminger@...tta.com>
CC:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH 6/6] netfilter: convert x_tables to use RCU

Stephen Hemminger a écrit :
> Replace existing reader/writer lock with Read-Copy-Update to
> elminate the overhead of a read lock on each incoming packet.
> This should reduce the overhead of iptables especially on SMP
> systems.
> 
> The previous code used a reader-writer lock for two purposes.
> The first was to ensure that the xt_table_info reference was not in
> process of being changed. Since xt_table_info is only freed via one
> routine, it was a direct conversion to RCU.
> 
> The other use of the reader-writer lock was to to block changes
> to counters while they were being read. This synchronization was
> fixed by the previous patch.  But still need to make sure table info
> isn't going away.
> 
> Signed-off-by: Stephen Hemminger <shemminger@...tta.com>
> 
> 
> ---
>  include/linux/netfilter/x_tables.h |   10 ++++++--
>  net/ipv4/netfilter/arp_tables.c    |   16 ++++++-------
>  net/ipv4/netfilter/ip_tables.c     |   27 +++++++++++-----------
>  net/ipv6/netfilter/ip6_tables.c    |   16 ++++++-------
>  net/netfilter/x_tables.c           |   45 ++++++++++++++++++++++++++-----------
>  5 files changed, 70 insertions(+), 44 deletions(-)

OK, I have a last comment Stephen

(I tested your patches on my dev machine and got nice results)

In net/ipv4/netfilter/ip_tables.c,
get_counters() can be quite slow on large firewalls, and it is
called from alloc_counters() with BH disabled, but also from __do_replace()
without BH disabled.

So get_counters() first iterates on set_entry_to_counter() for current cpu
but might be interrupted and we get bad results...

Solution is to always use seqcount_t protection, and no BH disabling at all.

Then, I wonder if all this is valid, since alloc_counters() might be supposed
to perform an atomic snapshots of all counters. This was possible because
of a write_lock_bh(), but does it make any sense to block all packets for
such a long time ??? If this is a strong requirement, I am afraid we have
to make other changes...


Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
---
 net/ipv4/netfilter/ip_tables.c |   25 +++----------------------
 1 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 7c50e2e..d208b25 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -900,25 +900,8 @@ get_counters(const struct xt_table_info *t,
 {
 	unsigned int cpu;
 	unsigned int i;
-	unsigned int curcpu;
-
-	/* Instead of clearing (by a previous call to memset())
-	 * the counters and using adds, we set the counters
-	 * with data used by 'current' CPU
-	 * We dont care about preemption here.
-	 */
-	curcpu = raw_smp_processor_id();
-
-	i = 0;
-	IPT_ENTRY_ITERATE(t->entries[curcpu],
-			  t->size,
-			  set_entry_to_counter,
-			  counters,
-			  &i);
 
 	for_each_possible_cpu(cpu) {
-		if (cpu == curcpu)
-			continue;
 		i = 0;
 		IPT_ENTRY_ITERATE(t->entries[cpu],
 				  t->size,
@@ -939,15 +922,12 @@ static struct xt_counters * alloc_counters(struct xt_table *table)
 	   (other than comefrom, which userspace doesn't care
 	   about). */
 	countersize = sizeof(struct xt_counters) * private->number;
-	counters = vmalloc_node(countersize, numa_node_id());
+	counters = __vmalloc(countersize, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
 
 	if (counters == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	/* First, sum counters... */
-	local_bh_disable();
 	get_counters(private, counters);
-	local_bh_enable();
 
 	return counters;
 }
@@ -1209,7 +1189,8 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
 	void *loc_cpu_old_entry;
 
 	ret = 0;
-	counters = vmalloc(num_counters * sizeof(struct xt_counters));
+	counters = __vmalloc(num_counters * sizeof(struct xt_counters),
+			     GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
 	if (!counters) {
 		ret = -ENOMEM;
 		goto out;

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ