Message-ID: <1291582432.2806.300.camel@edumazet-laptop>
Date: Sun, 05 Dec 2010 21:53:52 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: hagen@...u.net, xiaosuo@...il.com, wirelesser@...il.com,
netdev@...r.kernel.org, Pavel Emelyanov <xemul@...nvz.org>,
stable@...nel.org, "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: [PATCH] filter: fix sk_filter rcu handling
On Friday, December 3, 2010 at 07:32 +0100, Eric Dumazet wrote:
> On Wednesday, December 1, 2010 at 10:18 -0800, David Miller wrote:
>
> > However, I think it's still valuable to write a few JIT compilers for
> > the existing BPF stuff. I considered working on a sparc64 JIT just to
> > see what it would look like.
> >
> > If people work on the BPF optimizer and BPF JITs in parallel, we'll have
> > both ready at the same time. win++
>
> I began work on implementing a BPF JIT for x86_64.
>
> My plan is to use external helpers to load skb data/metadata, to keep
> the BPF program very short and avoid dependencies on struct layouts.
>
> These helpers would be the three loaders: load_word, load_half and
> load_byte.
>
> In case the bits are in the skb head, these helpers should be fast.
>
> For practical reasons, they would be written in ASM for their fast
> path, and in C for the slow path. Being in ASM lets them take a
> shortcut on error (unwinding the stack to perform the "return 0;")
> so that we don't have to test their return value in the JIT-generated
> program.
>
>
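To make the plan quoted above concrete, here is a hypothetical sketch
of what the C slow path of load_word could look like. The helper name
and its signature are assumptions for illustration; only
skb_copy_bits() is the real kernel API for reading possibly
non-linear skb data.

#include <linux/skbuff.h>

/* Hypothetical C slow path: called when the requested bytes are not in
 * the linear skb head. skb_copy_bits() copies from fragments and the
 * frag list as needed.
 */
static u32 load_word_slowpath(const struct sk_buff *skb, int offset)
{
	__be32 val;

	if (skb_copy_bits(skb, offset, &val, sizeof(val)) < 0)
		return 0; /* the real ASM helper would instead unwind the
			   * stack so the filter itself returns 0 */
	return ntohl(val);
}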
While working on this, I found an annoying problem with the current
code.
This patch is a stable candidate.
Thanks
[PATCH] filter: fix sk_filter rcu handling
Pavel Emelyanov tried to fix a race between sk_filter_(de|at)tach and
sk_clone() in commit 47e958eac280c263397
The problem is that several clones can share a common sk_filter, and
these clones might want to sk_filter_attach() their own filters at the
same time, overwriting old_filter->rcu and corrupting the RCU callback
queues.
We cannot use filter->rcu without being sure no other thread could do
the same thing.
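Schematically, the pre-patch race looks like this (illustrative only;
sk_filter_rcu_release() is the old callback this patch removes):

	/* CPU0: clone A replaces its filter, drops a reference */
	call_rcu_bh(&old_fp->rcu, sk_filter_rcu_release);

	/* CPU1: clone B replaces the same shared filter concurrently */
	call_rcu_bh(&old_fp->rcu, sk_filter_rcu_release);

The second call reuses an rcu_head that is already queued, overwriting
its linkage and corrupting that CPU's RCU callback list.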
Switch the code to a more conventional refcounting technique: do the
atomic decrement immediately and queue one RCU callback only when the
last reference is released. Since only the thread that drops the last
reference queues the rcu_head, it can never be queued twice.
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
Cc: Pavel Emelyanov <xemul@...nvz.org>
Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Cc: stable@...nel.org
---
include/net/sock.h | 4 +++-
net/core/filter.c | 19 ++++++-------------
2 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index a6338d0..4308af7 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1155,6 +1155,8 @@ extern void sk_common_release(struct sock *sk);
/* Initialise core socket variables */
extern void sock_init_data(struct socket *sock, struct sock *sk);
+extern void sk_filter_release_rcu(struct rcu_head *rcu);
+
/**
* sk_filter_release - release a socket filter
* @fp: filter to remove
@@ -1165,7 +1167,7 @@ extern void sock_init_data(struct socket *sock, struct sock *sk);
static inline void sk_filter_release(struct sk_filter *fp)
{
if (atomic_dec_and_test(&fp->refcnt))
- kfree(fp);
+ call_rcu_bh(&fp->rcu, sk_filter_release_rcu);
}
static inline void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp)
diff --git a/net/core/filter.c b/net/core/filter.c
index c1ee800..ae21a0d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -589,23 +589,16 @@ int sk_chk_filter(struct sock_filter *filter, int flen)
EXPORT_SYMBOL(sk_chk_filter);
/**
- * sk_filter_rcu_release - Release a socket filter by rcu_head
+ * sk_filter_release_rcu - Release a socket filter by rcu_head
* @rcu: rcu_head that contains the sk_filter to free
*/
-static void sk_filter_rcu_release(struct rcu_head *rcu)
+void sk_filter_release_rcu(struct rcu_head *rcu)
{
struct sk_filter *fp = container_of(rcu, struct sk_filter, rcu);
- sk_filter_release(fp);
-}
-
-static void sk_filter_delayed_uncharge(struct sock *sk, struct sk_filter *fp)
-{
- unsigned int size = sk_filter_len(fp);
-
- atomic_sub(size, &sk->sk_omem_alloc);
- call_rcu_bh(&fp->rcu, sk_filter_rcu_release);
+ kfree(fp);
}
+EXPORT_SYMBOL(sk_filter_release_rcu);
/**
* sk_attach_filter - attach a socket filter
@@ -649,7 +642,7 @@ int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
rcu_assign_pointer(sk->sk_filter, fp);
if (old_fp)
- sk_filter_delayed_uncharge(sk, old_fp);
+ sk_filter_uncharge(sk, old_fp);
return 0;
}
EXPORT_SYMBOL_GPL(sk_attach_filter);
@@ -663,7 +656,7 @@ int sk_detach_filter(struct sock *sk)
sock_owned_by_user(sk));
if (filter) {
rcu_assign_pointer(sk->sk_filter, NULL);
- sk_filter_delayed_uncharge(sk, filter);
+ sk_filter_uncharge(sk, filter);
ret = 0;
}
return ret;
--