Message-ID: <1429797767.22254.41.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Thu, 23 Apr 2015 07:02:47 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Josh Hunt <johunt@...mai.com>
Cc: David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
edumazet@...gle.com
Subject: Re: [PATCH] fix tcp fin memory accounting
On Thu, 2015-04-23 at 08:48 -0500, Josh Hunt wrote:
> On 04/21/2015 07:09 PM, Eric Dumazet wrote:
> >
> > Note that this patch adds a deadlock possibility in some stress
> > situations.
> >
> > If a process owning some tcp socket dies, and tcp_mem[2] is already hit,
> > all sk_stream_alloc_skb() calls can return NULL and we loop in tcp_send_fin(),
> > making no progress because we cannot free any tcp memory.
>
> Ugh. Thanks for fixing this Eric!
No problem ;)
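
To make the failure mode concrete, the old pattern boils down to the
sketch below (illustrative only, not the literal kernel code): once
tcp_mem[2] is hit and the dying process cannot release anything, the
allocation keeps failing and we spin forever while holding the socket
lock:

	/* Illustrative sketch of the looping pattern, not the actual code. */
	for (;;) {
		skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation);
		if (skb)
			break;	/* never taken while tcp_mem[2] stays exhausted */
		yield();	/* loops forever; nothing can free tcp memory */
	}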
For the record, I've tested this follow-up patch, which I'll formally
submit when net-next reopens:

If there is an already-sent skb in the write queue (we look at the
tail, of course), and:
- we are under TCP memory pressure,
- or allocation of a fresh skb using GFP_KERNEL failed,
-> we add the FIN flag to that tail skb; it will eventually go out
later, after a timeout (see the outline below and the diff).
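
In outline, the reworked tcp_send_fin() behaves roughly like this
(simplified sketch, error handling elided; the real change is the diff
below):

	tskb = tcp_write_queue_tail(sk);
	if (tskb && (tcp_send_head(sk) || sk_under_memory_pressure(sk))) {
		/* Coalesce: set TCPHDR_FIN on the tail skb. If that skb
		 * was already transmitted, also advance snd_nxt and let
		 * the retransmit timer deliver the FIN later.
		 */
	} else {
		/* Single allocation attempt: on failure, fall back to
		 * coalescing on the tail skb if there is one, otherwise
		 * give up instead of looping.
		 */
	}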
net/ipv4/tcp_output.c | 41 ++++++++++++++++++++--------------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 2ade67b7cdb0..fe6558eb64f3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2827,33 +2827,34 @@ static void sk_forced_wmem_schedule(struct sock *sk, int size)
 	sk_memory_allocated_add(sk, amt, &status);
 }
 
-/* Send a fin. The caller locks the socket for us. This cannot be
- * allowed to fail queueing a FIN frame under any circumstances.
+/* Send a FIN. The caller locks the socket for us.
+ * We should try to send a FIN packet really hard, but eventually give up.
  */
 void tcp_send_fin(struct sock *sk)
 {
+	struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct sk_buff *skb = tcp_write_queue_tail(sk);
-	int mss_now;
 
-	/* Optimization, tack on the FIN if we have a queue of
-	 * unsent frames. But be careful about outgoing SACKS
-	 * and IP options.
+	/* Optimization, tack on the FIN if we have one skb in write queue and
+	 * this skb was not yet sent, or we are under memory pressure.
+	 * Note: in the latter case, FIN packet will be sent after a timeout,
+	 * as TCP stack thinks it has been transmitted once.
 	 */
-	mss_now = tcp_current_mss(sk);
-
-	if (tcp_send_head(sk)) {
-		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_FIN;
-		TCP_SKB_CB(skb)->end_seq++;
+	if (tskb && (tcp_send_head(sk) || sk_under_memory_pressure(sk))) {
+coalesce:
+		TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;
+		TCP_SKB_CB(tskb)->end_seq++;
 		tp->write_seq++;
+		if (!tcp_send_head(sk)) {
+			tp->snd_nxt++;
+			return;
+		}
 	} else {
-		/* Socket is locked, keep trying until memory is available. */
-		for (;;) {
-			skb = alloc_skb_fclone(MAX_TCP_HEADER,
-					       sk->sk_allocation);
-			if (skb)
-				break;
-			yield();
+		skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation);
+		if (unlikely(!skb)) {
+			if (tskb)
+				goto coalesce;
+			return;
 		}
 		skb_reserve(skb, MAX_TCP_HEADER);
 		sk_forced_wmem_schedule(sk, skb->truesize);
@@ -2862,7 +2863,7 @@ void tcp_send_fin(struct sock *sk)
 				     TCPHDR_ACK | TCPHDR_FIN);
 		tcp_queue_skb(sk, skb);
 	}
-	__tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF);
+	__tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF);
 }
 
 /* We get here when a process closes a file descriptor (either due to