lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 3 Mar 2010 16:36:23 +0800 From: Zhu Yi <yi.zhu@...el.com> To: netdev@...r.kernel.org Cc: Zhu Yi <yi.zhu@...el.com>, David Miller <davem@...emloft.net>, Arnaldo Carvalho de Melo <acme@...stprotocols.net>, "Pekka Savola (ipv6)" <pekkas@...core.fi>, Patrick McHardy <kaber@...sh.net>, Vlad Yasevich <vladislav.yasevich@...com>, Sridhar Samudrala <sri@...ibm.com>, Jon Maloy <jon.maloy@...csson.com>, Allan Stephens <allan.stephens@...driver.com>, Andrew Hendry <andrew.hendry@...il.com>, Eric Dumazet <eric.dumazet@...il.com> Subject: [PATCH V2 1/7] net: add limit for socket backlog We got system OOM while running some UDP netperf testing on the loopback device. The case is multiple senders sent stream UDP packets to a single receiver via loopback on local host. Of course, the receiver is not able to handle all the packets in time. But we surprisingly found that these packets were not discarded due to the receiver's sk->sk_rcvbuf limit. Instead, they are kept queuing to sk->sk_backlog and finally ate up all the memory. We believe this is a secure hole that a none privileged user can crash the system. The root cause for this problem is, when the receiver is doing __release_sock() (i.e. after userspace recv, kernel udp_recvmsg -> skb_free_datagram_locked -> release_sock), it moves skbs from backlog to sk_receive_queue with the softirq enabled. In the above case, multiple busy senders will almost make it an endless loop. The skbs in the backlog end up eat all the system memory. The issue is not only for UDP. Any protocols using socket backlog is potentially affected. The patch adds limit for socket backlog so that the backlog size cannot be expanded endlessly. Reported-by: Alex Shi <alex.shi@...el.com> Cc: David Miller <davem@...emloft.net> Cc: Arnaldo Carvalho de Melo <acme@...stprotocols.net> Cc: Alexey Kuznetsov <kuznet@....inr.ac.ru Cc: "Pekka Savola (ipv6)" <pekkas@...core.fi> Cc: Patrick McHardy <kaber@...sh.net> Cc: Vlad Yasevich <vladislav.yasevich@...com> Cc: Sridhar Samudrala <sri@...ibm.com> Cc: Jon Maloy <jon.maloy@...csson.com> Cc: Allan Stephens <allan.stephens@...driver.com> Cc: Andrew Hendry <andrew.hendry@...il.com> Signed-off-by: Zhu Yi <yi.zhu@...el.com> Signed-off-by: Eric Dumazet <eric.dumazet@...il.com> --- include/net/sock.h | 15 ++++++++++++++- net/core/sock.c | 16 ++++++++++++++-- 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 6cb1676..2516d76 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -253,6 +253,8 @@ struct sock { struct { struct sk_buff *head; struct sk_buff *tail; + int len; + int limit; } sk_backlog; wait_queue_head_t *sk_sleep; struct dst_entry *sk_dst_cache; @@ -589,7 +591,7 @@ static inline int sk_stream_memory_free(struct sock *sk) return sk->sk_wmem_queued < sk->sk_sndbuf; } -/* The per-socket spinlock must be held here. */ +/* OOB backlog add */ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb) { if (!sk->sk_backlog.tail) { @@ -601,6 +603,17 @@ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb) skb->next = NULL; } +/* The per-socket spinlock must be held here. */ +static inline int sk_add_backlog_limited(struct sock *sk, struct sk_buff *skb) +{ + if (sk->sk_backlog.len >= max(sk->sk_backlog.limit, sk->sk_rcvbuf << 1)) + return -ENOBUFS; + + sk_add_backlog(sk, skb); + sk->sk_backlog.len += skb->truesize; + return 0; +} + static inline int sk_backlog_rcv(struct sock *sk, struct sk_buff *skb) { return sk->sk_backlog_rcv(sk, skb); diff --git a/net/core/sock.c b/net/core/sock.c index fcd397a..6e22dc9 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -340,8 +340,12 @@ int sk_receive_skb(struct sock *sk, struct sk_buff *skb, const int nested) rc = sk_backlog_rcv(sk, skb); mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); - } else - sk_add_backlog(sk, skb); + } else if (sk_add_backlog_limited(sk, skb)) { + bh_unlock_sock(sk); + atomic_inc(&sk->sk_drops); + goto discard_and_relse; + } + bh_unlock_sock(sk); out: sock_put(sk); @@ -1139,6 +1143,7 @@ struct sock *sk_clone(const struct sock *sk, const gfp_t priority) sock_lock_init(newsk); bh_lock_sock(newsk); newsk->sk_backlog.head = newsk->sk_backlog.tail = NULL; + newsk->sk_backlog.len = 0; atomic_set(&newsk->sk_rmem_alloc, 0); /* @@ -1542,6 +1547,12 @@ static void __release_sock(struct sock *sk) bh_lock_sock(sk); } while ((skb = sk->sk_backlog.head) != NULL); + + /* + * Doing the zeroing here guarantee we can not loop forever + * while a wild producer attempts to flood us. + */ + sk->sk_backlog.len = 0; } /** @@ -1874,6 +1885,7 @@ void sock_init_data(struct socket *sock, struct sock *sk) sk->sk_allocation = GFP_KERNEL; sk->sk_rcvbuf = sysctl_rmem_default; sk->sk_sndbuf = sysctl_wmem_default; + sk->sk_backlog.limit = sk->sk_rcvbuf << 1; sk->sk_state = TCP_CLOSE; sk_set_socket(sk, sock); -- 1.6.3.3 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists