Message-ID: <20180416035546.GA5388@nautica>
Date: Mon, 16 Apr 2018 05:55:46 +0200
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Michal Kubecek <mkubecek@...e.cz>, netdev@...r.kernel.org
Subject: Re: tcp hang when socket fills up ?
Eric Dumazet wrote on Sun, Apr 15, 2018:
> Are you sure you do not have some iptables/netfilter stuff ?
I have a basic firewall setup with default rules; e.g. the INPUT chain
starts with
-m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
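For context, the general shape of the ruleset (a sketch in iptables-save
form, not my exact rules; the LOG rule is where the dropped-packet lines
come from):

```
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -j LOG --log-prefix "INPUT drop: "
-A INPUT -j DROP
```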
That said, I just dropped it on the server to check, and that seems to
work around the issue?!
When logging everything that gets dropped, conntrack appears to decide
at some point that the connection is no longer established, but only if
tcp_timestamps is enabled; just, err, how?
And sure enough, if I restore the firewall while a connection is up,
that connection just hangs; at some point conntrack no longer considers
it connected (but it worked for a while!)
Here's the kind of logs I get from iptables:
IN=wlp1s0 OUT= MAC=00:c2:c6:b4:7e:c7:a4:12:42:b5:5d:fc:08:00 SRC=client DST=server LEN=52 TOS=0x00 PREC=0x00 TTL=52 ID=17038 DF PROTO=TCP SPT=41558 DPT=15609 WINDOW=1212 RES=0x00 ACK URGP=0
> ss -temoi might give us more info
hang:
ESTAB 0 81406 server:15609 client:41558 users:(("socat",pid=17818,fd=5)) timer:(on,48sec,11) uid:1000 ino:137253 sk:6a <->
skmem:(r0,rb369280,t0,tb147456,f2050,w104446,o0,bl0,d1) ts sack
reno wscale:7,7 rto:15168 backoff:6 rtt:36.829/6.492 ato:40
mss:1374 pmtu:1500 rcvmss:1248 advmss:1448 cwnd:1 ssthresh:16
bytes_acked:32004 bytes_received:4189 segs_out:84 segs_in:55
data_segs_out:77 data_segs_in:18 send 298.5Kbps lastsnd:12483
lastrcv:27801 lastack:27726 pacing_rate 19.1Mbps delivery_rate
4.1Mbps busy:28492ms unacked:31 retrans:1/6 lost:31 rcv_rtt:29
rcv_space:29200 rcv_ssthresh:39184 notsent:38812 minrtt:25.152
working (tcp_timestamps=0):
ESTAB 0 36 server:15080 client:32979 users:(("socat",pid=17047,fd=5)) timer:(on,226ms,0) uid:1000 ino:90917 sk:23 <->
skmem:(r0,rb369280,t0,tb1170432,f1792,w2304,o0,bl0,d3) sack reno
wscale:7,7 rto:230 rtt:29.413/5.345 ato:64 mss:1386 pmtu:1500
rcvmss:1248 advmss:1460 cwnd:4 ssthresh:3 bytes_acked:17391762
bytes_received:62397 segs_out:13964 segs_in:8642
data_segs_out:13895 data_segs_in:1494 send 1.5Mbps lastsnd:4
lastrcv:5 lastack:5 pacing_rate 1.8Mbps delivery_rate 1.2Mbps
busy:56718ms unacked:1 retrans:0/11 rcv_rtt:9112.95 rcv_space:29233
rcv_ssthresh:41680 minrtt:25.95
working (no iptables):
ESTAB 0 0 server:61460 client:20468 users:(("socat",pid=17880,fd=5)) uid:1000 ino:129982 sk:6f <->
skmem:(r0,rb369280,t0,tb1852416,f0,w0,o0,bl0,d1) ts sack reno
wscale:7,7 rto:244 rtt:43.752/7.726 ato:40 mss:1374 pmtu:1500
rcvmss:1248 advmss:1448 cwnd:10 bytes_acked:2617302
bytes_received:5441 segs_out:1929 segs_in:976 data_segs_out:1919
data_segs_in:41 send 2.5Mbps lastsnd:2734 lastrcv:2734 lastack:2705
pacing_rate 5.0Mbps delivery_rate 12.7Mbps busy:1884ms rcv_rtt:30
rcv_space:29200 rcv_ssthresh:39184 minrtt:26.156
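Reading the hung socket above: cwnd:1, backoff:6, unacked:31 and lost:31
look like pure retransmission backoff with no ACKs coming back at all. If I
have the RTO arithmetic roughly right (my assumption: the base RTO is about
srtt plus an rttvar term floored at the 200ms TCP_RTO_MIN, doubled per
backoff step), its timer value even lines up:

```python
# Back-of-the-envelope check of the rto/backoff fields shown by ss.
# Assumption: base RTO ~= srtt + 200ms (rttvar floored at TCP_RTO_MIN),
# doubled once per backoff step.
TCP_RTO_MIN_MS = 200

def backed_off_rto(srtt_ms, backoff):
    base = round(srtt_ms) + TCP_RTO_MIN_MS  # rough base RTO in ms
    return base * (2 ** backoff)

print(backed_off_rto(36.829, 6))  # hung socket: rtt 36.829, backoff 6
```

That prints 15168, exactly the rto: the server has been retransmitting into
a void and doubling its timer, which is what a silent drop in front of the
socket would look like.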
> Really it looks like at some point, all incoming packets are shown by
> tcpdump but do not reach the TCP socket anymore.
>
> (segs_in: might be steady, look at the d0 counter shown by ss -temoi
> (dX : drop counters, sk->sk_drops)
segs_in does not increase with the retransmits; the d1 drop counter seems stable.
> While running your experiment, try on the server.
>
> perf record -a -g -e skb:kfree_skb sleep 30
> perf report
While I understand what that should do, I am not sure why I do not get
any call graph, so it doesn't help tell what called kfree_skb and thus
what decided to drop the packet (although we no longer really need that
now..)
perf script just shows kfree_skb, e.g.
swapper 0 [001] 237244.869321: skb:kfree_skb: skbaddr=0xffff8800360fda00 protocol=2048 location=0xffffffff817a1a77
9458e3 kfree_skb (/usr/lib/debug/lib/modules/4.16.0-300.fc28.x86_64/vmlinux)
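For completeness: the location= address in that tracepoint is the call
site, so even without a call graph it can be resolved against
/proc/kallsyms. A rough sketch (run as root, otherwise the kallsyms
addresses all read as 0):

```python
import bisect

def load_kallsyms(path="/proc/kallsyms"):
    """Parse 'addr type name' lines into a sorted (addr, name) list.
    Needs root: without it the kernel hides the real addresses."""
    syms = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 3 and parts[1] in ("t", "T"):
                syms.append((int(parts[0], 16), parts[2]))
    syms.sort()
    return syms

def resolve(syms, addr):
    """Map an address like the tracepoint's location= to 'symbol+0xoff'."""
    i = bisect.bisect_right(syms, (addr, "\xff")) - 1
    if i < 0:
        return None
    base, name = syms[i]
    return "%s+0x%x" % (name, addr - base)

# e.g. resolve(load_kallsyms(), 0xffffffff817a1a77)
```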
---
So I guess that ultimately the problem is why conntrack suddenly decides
that an established connection isn't established anymore, despite ss
still listing it as such..
I'm discovering `conntrack(8)`, but what strikes me as interesting is
that even conntrack reports the connection as established (looking at a
new connection after iptables started dropping its packets):
# conntrack -L | grep 21308
tcp 6 267 ESTABLISHED src=server dst=client sport=21308 dport=37552 src=client dst=server sport=37552 dport=21308 [ASSURED] mark=0 use=1
compared to another that isn't dropped (the old connection without
tcp_timestamps):
tcp 6 299 ESTABLISHED src=server dst=client sport=15080 dport=32979 src=client dst=server sport=32979 dport=15080 [ASSURED] mark=0 use=1
The expect/dying/unconfirmed tables all are empty.
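To compare those entries field by field: the -L lines split cleanly into
the original-direction tuple and the reply-direction tuple (a quick
throwaway parser, assuming the usual -L output format):

```python
def parse_conntrack_line(line):
    """Split one `conntrack -L` line into state info plus the two
    directional tuples (original direction first, reply second)."""
    fields = line.split()
    info = {"proto": fields[0], "timeout": int(fields[2]), "state": fields[3]}
    tuples = [f.split("=", 1) for f in fields
              if f.split("=", 1)[0] in ("src", "dst", "sport", "dport")]
    info["original"] = dict(tuples[:4])
    info["reply"] = dict(tuples[4:])
    return info
```

Diffing the parsed dicts of the hung and working entries shows nothing
interesting beyond the ports, which makes the -G behaviour below all the
stranger.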
. . . Oh, there is something interesting there: the hung connection
doesn't come up with -G?
working:
conntrack -G --protonum tcp --src server --dst client --sport 15080 --dport 32979
tcp 6 299 ESTABLISHED src=server dst=client sport=15080 dport=32979 src=client dst=server sport=32979 dport=15080 [ASSURED] mark=0 use=3
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.
hang:
# conntrack -G --protonum tcp --src server --dst client --sport 21308 --dport 37552
conntrack v1.4.4 (conntrack-tools): 0 flow entries have been shown.
So something happened that makes it show up in -L (table dump) but not
when querying...?
And only when there is enough traffic: I had previously kept such a
connection going for hours without any workaround, as long as I made
sure not to display more than a screenful at a time.
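One mechanism that would fit all of this (a guess on my side): conntrack's
TCP window tracking only accepts segments that fall inside the window it
has tracked. A crude Python reduction of tcp_in_window() from the kernel's
net/netfilter/nf_conntrack_proto_tcp.c, reusing its field names (the real
check also tracks both directions, SACK and window scaling):

```python
from collections import namedtuple

# Per-direction tracking state, named after the kernel's struct ip_ct_tcp_state:
#   td_end    - highest sequence number sent
#   td_maxend - highest sequence the peer's advertised window allows
#   td_maxwin - largest window seen
Side = namedtuple("Side", "td_end td_maxend td_maxwin")

def tcp_in_window(seq, length, ack, sender, receiver):
    """Very simplified: does this segment fit the tracked window?"""
    seq_ok = (seq <= sender.td_maxend and
              seq + length >= sender.td_end - receiver.td_maxwin)
    ack_ok = ack <= receiver.td_end
    return seq_ok and ack_ok
```

If conntrack picks a connection up mid-stream (e.g. the firewall restored
while it is up), it never saw the SYN's window-scale option, so td_maxwin
is derived from the unscaled advertised window; once the peers get far
enough ahead, perfectly valid segments fail the check, the packet is rated
INVALID rather than ESTABLISHED, and the final DROP eats it. That would
match both "it worked for a while" and "only with enough traffic". If the
theory is right, setting net.netfilter.nf_conntrack_tcp_be_liberal=1
should make the hang disappear.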
Thanks again,
--
Dominique Martinet | Asmadeus