Message-ID: <1300358140.3133.117.camel@edumazet-laptop>
Date: Thu, 17 Mar 2011 11:35:40 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>,
Jan Engelhardt <jengelh@...ozas.de>
Cc: kaber@...sh.net, netfilter-devel@...r.kernel.org,
netdev@...r.kernel.org, hawk@...u.dk
Subject: [PATCH] netfilter: xtables: fix reentrancy
On Wednesday, 16 March 2011 at 13:16 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Wed, 16 Mar 2011 20:00:05 +0100
>
> > We currently use a percpu spinlock to 'protect' rule bytes/packets
> > counters, after various attempts to use RCU instead.
> >
> > Lately we added a seqlock so that get_counters() can run without
> > blocking BH or 'writers'. But we really use the seqcount in it.
> >
> > Spinlock itself is only locked by the current cpu, so we can remove it
> > completely.
> >
> > This cleans up the API, using correct 'writer' vs 'reader' semantics.
> >
> > At replace time, the get_counters() call makes sure all cpus are done
> > using the old table.
> >
> > We could probably avoid blocking BH (we currently block them in the xmit
> > path), but that's a different topic ;)
> >
> > Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
>
> FWIW, I think this is a great idea.
I knew you would be interested :)

While looking at it (and trying to require only preemption disabled
instead of BH disabled), I came to believe that the stackptr management
is not safe.

I suggest the following patch to make sure we restore *stackptr to origptr
before re-enabling BH (or, later, preemption).

Thanks

[PATCH] netfilter: xtables: fix reentrancy
commit f3c5c1bfd4308 (make ip_tables reentrant) introduced a race in
the handling of the stackptr restore at the end of ipt_do_table().

The restore must happen before the call to xt_info_rdunlock_bh().
Otherwise we allow cpu preemption before *stackptr is written back, and
another cpu overwrites the stackptr of the original one.

A second fix changes the underflow test to compare against the origptr
value instead of 0; otherwise we allow a jump from different hooks.
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
Cc: Jan Engelhardt <jengelh@...ozas.de>
Cc: Patrick McHardy <kaber@...sh.net>
---
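Note (not part of the commit message): the underflow change described
above can be illustrated with a small user-space model. This is only a
sketch under stated assumptions. The names struct jumpstack, push(),
do_return() and nested_return() are hypothetical and do not exist in the
kernel; only the two comparisons mirror the old and new tests in the diff.

```c
#include <assert.h>

#define STACK_SIZE 8

/* Toy stand-in for one CPU's jump stack; not the kernel's data layout. */
struct jumpstack {
	int entries[STACK_SIZE];	/* saved return positions */
	unsigned int sp;		/* plays the role of *stackptr */
};

typedef int (*underflow_fn)(const struct jumpstack *js, unsigned int origptr);

/* Old test: only a completely empty stack counts as underflow. */
static int underflow_old(const struct jumpstack *js, unsigned int origptr)
{
	(void)origptr;
	return js->sp == 0;
}

/* Patched test: reaching the depth recorded on entry counts as underflow. */
static int underflow_new(const struct jumpstack *js, unsigned int origptr)
{
	return js->sp <= origptr;
}

static void push(struct jumpstack *js, int pos)
{
	assert(js->sp < STACK_SIZE);
	js->entries[js->sp++] = pos;
}

/* Model a RETURN: -1 means "underflow, fall back to the hook's policy". */
static int do_return(struct jumpstack *js, unsigned int origptr,
		     underflow_fn underflow)
{
	if (underflow(js, origptr))
		return -1;
	return js->entries[--js->sp];
}

/*
 * An outer traversal has pushed one return position (42) when a nested
 * traversal starts on the same stack and immediately hits a RETURN.
 */
static int nested_return(underflow_fn underflow)
{
	struct jumpstack js = { .sp = 0 };
	unsigned int origptr;

	push(&js, 42);		/* outer traversal jumped into a chain */
	origptr = js.sp;	/* nested traversal records its entry depth */
	return do_return(&js, origptr, underflow);
}
```

In this model, nested_return(underflow_old) pops the entry that belongs
to the outer traversal, while nested_return(underflow_new) reports an
underflow and leaves the outer entry alone. The ordering fix (restoring
*stackptr before the unlock) is not modelled here.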
net/ipv4/netfilter/ip_tables.c | 4 ++--
net/ipv6/netfilter/ip6_tables.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index b09ed0d..ffcea0d 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -387,7 +387,7 @@ ipt_do_table(struct sk_buff *skb,
verdict = (unsigned)(-v) - 1;
break;
}
- if (*stackptr == 0) {
+ if (*stackptr <= origptr) {
e = get_entry(table_base,
private->underflow[hook]);
pr_debug("Underflow (this is normal) "
@@ -427,10 +427,10 @@ ipt_do_table(struct sk_buff *skb,
/* Verdict */
break;
} while (!acpar.hotdrop);
- xt_info_rdunlock_bh();
pr_debug("Exiting %s; resetting sp from %u to %u\n",
__func__, *stackptr, origptr);
*stackptr = origptr;
+ xt_info_rdunlock_bh();
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
#else
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index c9598a9..0b2af9b 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -410,7 +410,7 @@ ip6t_do_table(struct sk_buff *skb,
verdict = (unsigned)(-v) - 1;
break;
}
- if (*stackptr == 0)
+ if (*stackptr <= origptr)
e = get_entry(table_base,
private->underflow[hook]);
else
@@ -441,8 +441,8 @@ ip6t_do_table(struct sk_buff *skb,
break;
} while (!acpar.hotdrop);
- xt_info_rdunlock_bh();
*stackptr = origptr;
+ xt_info_rdunlock_bh();
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;