[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181128010903epcms1p7ef44541625212f6af3fc2d3e0d7eb390@epcms1p7>
Date: Wed, 28 Nov 2018 10:09:03 +0900
From: 배석진 <soukjin.bae@...sung.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"lorenzo@...gle.com" <lorenzo@...gle.com>
CC: 배석진 <soukjin.bae@...sung.com>,
유금환 <geumhwan.yu@...sung.com>
Subject: FW: [Resource Leak] Suggesting patch for tcp_close
Hello all,
nobody concern about this? minor problem? :(
we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.
and easily reproduced in wifi network because of android gms... :p
i heard that gms try connect to their server to confirm wifi connection.
and they can't recieve the fin-ack frequently.
thanks,
--------- Original Message ---------
Sender : 배석진 <soukjin.bae@...sung.com> Staff Engineer/System개발2그룹(무선)/삼성전자
Date : 2018-11-23 16:22 (GMT+9)
Title : Suggesting patch for tcp_close
Dear all,
This is Soukin Bae working on Samsung Elec. Mobile Division.
we have a problem with tcp closing.
in shortly,
1. on 4-way handshking to close session
2. if ack pkt is not arrived from opposite side
3. then session can't be closed forever
in mobile device, condition 2 can be happend in various case.
like as turn off wifi or mobile data. or bad condition of air network, etc..
this could be occur in both side of connection.
when issue happened during active closing, the session remained with FIN_WAIT1 state.
and at passive closing, remained with LAST_ACK state.
---------------------------------------------------------------------------------------------
below is test result after wifi on/off repetition (without mobile data).
maybe 'Foreign Address' sent the fin-ack when wifi-off state.
so device coun't recieve ack pkt further, and the session is remained permanently.
and their count is growing up. this is resource leak.
### turn on wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program Name
tcp 0 0 127.0.0.1:5037 0.0.0.0:* LISTEN 0 36357 6907/adbd
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660 2404:6800:4008:c00::bc:5228 ESTABLISHED 10041 74347 6523/com.google.android.gms.persistent
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148 2404:6800:4004:800::2003:80 LAST_ACK 0 0 -
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512 64:ff9b::3444:f3dc:443 ESTABLISHED 10137 77447 9522/com.samsung.android.game.gos
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294 2404:6800:4005:80c::2004:443 LAST_ACK 0 0 -
tcp6 1 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260 64:ff9b::34d0:9421:80 LAST_ACK 0 0 -
### turn off wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program Name
tcp 0 0 127.0.0.1:5037 0.0.0.0:* LISTEN 0 36357 6907/adbd
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148 2404:6800:4004:800::2003:80 LAST_ACK 0 0 -
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294 2404:6800:4005:80c::2004:443 LAST_ACK 0 0 -
tcp6 1 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260 64:ff9b::34d0:9421:80 LAST_ACK 0 0 -
---------------------------------------------------------------------------------------------
this is our analysis
when app finished using the socket(tcp session), it calls sock_close.
then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan().
11-23 11:40:55.676 [5: Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003]
11-23 11:40:55.676 [5: Thread-44:11210] Call trace:
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008ffdda8>] tcp_set_state+0x1b8/0x1f0
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008ffe3f8>] tcp_close+0x484/0x534
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff800902f830>] inet_release+0x60/0x74
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8009074238>] inet6_release+0x30/0x48
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008f35e4c>] __sock_release+0x40/0x104
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008f3bab0>] sock_close+0x18/0x28
11-23 11:40:55.678 [5: Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]
at this point, if the FIN_ACK comes, there's no problem. all is well~
but without that and when turn off wifi,
netd trying to close all the session by calling tcp_abort, sock_diag_destory.
11-23 11:41:38.463 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4004:800::2003]
11-23 11:41:38.464 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4005:80c::2004]
11-23 11:41:38.464 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>, [64:ff9b::34d0:9421]
but because of this sock was already changed to SOCK_DEAD state by tcp_close(), tcp_done() can't be excuted.
so this session can't be closed.
int tcp_abort(struct sock *sk, int err)
{
...
if (!sock_flag(sk, SOCK_DEAD)) { //// when SOCK_DEAD, tcp_done() be skip.
...
sk->sk_error_report(sk);
if (tcp_need_reset(sk->sk_state))
tcp_send_active_reset(sk, GFP_ATOMIC);
tcp_done(sk);
}
...
return 0;
}
---------------------------------------------------------------------------------------------
we thought that just sk_error_report have to reside in that condition, SOCK_DEAD.
and send-reset and tcp_done should to be always.
we fixed it like as below, and confirmed that issue resolved.
please check this.
Best regards,
From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
From: soukjin bae <soukjin.bae@...sung.com>
Date: Fri, 23 Nov 2018 15:56:53 +0900
Subject: [PATCH] net: close session always when tcp_abort
session before recieve the FIN_ACK couldn't be closed by tcp_abort
they will remained permanently. close them
Signed-off-by: soukjin bae <soukjin.bae@...sung.com>
Signed-off-by: geumhwan yu <geumhwan.yu@...sung.com>
---
net/ipv4/tcp.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9e6bc4d6daa7..faf4a8bbec8e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
/* This barrier is coupled with smp_rmb() in tcp_poll() */
smp_wmb();
sk->sk_error_report(sk);
- if (tcp_need_reset(sk->sk_state))
- tcp_send_active_reset(sk, GFP_ATOMIC);
- tcp_done(sk);
}
+ if (tcp_need_reset(sk->sk_state))
+ tcp_send_active_reset(sk, GFP_ATOMIC);
+ tcp_done(sk);
+
bh_unlock_sock(sk);
local_bh_enable();
tcp_write_queue_purge(sk);
--
2.13.0
Powered by blists - more mailing lists