[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180417035201.GA10611@nautica>
Date: Tue, 17 Apr 2018 12:52:01 +0900
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Florian Westphal <fw@...len.de>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Michal Kubecek <mkubecek@...e.cz>, netdev@...r.kernel.org
Subject: Re: tcp hang when socket fills up ?
Thank you for the hints. Out of order reply.
Florian Westphal wrote on Mon, Apr 16, 2018:
> echo 1 > /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal
>
> which stops conntrack from marking packets with out-of-window
> acks as invalid.
That worked, the connection stays established with this set.
> echo 6 > /proc/sys/net/netfilter/nf_conntrack_log_invalid
Took me a while to figure how to log that properly (why do distros not
ship ulogd?!) but that is helpful.
I have attached the full logs to this mail as text attachments in case I
missed something in my excerpts
Specifically, there are many outbound invalid packets, but as Marcelo
pointed out there is no conntrack netfilter for outbound packets...
The first invalid incoming packet is that one (json output seemed to be
the most convenient as it has tcp seq/ackseq and the oob.prefix message,
feel free to ask for some other format)
{"timestamp": "2018-04-17T10:29:14.956485", "dvc": "Netfilter",
"raw.pktlen": 52, "raw.pktcount": 1, "oob.prefix": "nf_ct_tcp: ACK is
over the upper bound (ACKed data not seen yet) ", "oob.time.sec":
1523928554, "oob.time.usec": 956485, "oob.mark": 0, "oob.hook": 0,
"oob.family": 2, "oob.protocol": 2048, "raw.label": 0, "ip.protocol": 6,
"ip.tos": 0, "ip.ttl": 52, "ip.totlen": 52, "ip.ihl": 5, "ip.csum":
50909, "ip.id": 30328, "ip.fragoff": 16384, "src_port": 30280,
"dest_port": 29543, "tcp.seq": 4048786673, "tcp.ackseq": 3024392506,
"tcp.window": 781, "tcp.offset": 0, "tcp.reserved": 0, "tcp.urg": 0,
"tcp.ack": 1, "tcp.psh": 0, "tcp.rst": 0, "tcp.syn": 0, "tcp.fin": 0,
"tcp.res1": 0, "tcp.res2": 0, "tcp.csum": 17752, "oob.in": "",
"oob.out": "", "src_ip": "client", "dest_ip": "server"}
which is the first ack that isn't seen alright (sent packet, blocked
ack, replays):
10:29:14.926115 IP server.29543 > client.30280: Flags [.], seq 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val 681965809 ecr 1634528339], length 1374
...
10:29:14.956485 IP client.30280 > server.29543: Flags [.], ack 3024392506, win 781, options [nop,nop,TS val 1634528369 ecr 681965809], length 0
...
10:29:15.255489 IP server.29543 > client.30280: Flags [.], seq 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val 681966138 ecr 1634528369], length 1374
10:29:15.281581 IP client.30280 > server.29543: Flags [.], ack 3024436474, win 1444, options [nop,nop,TS val 1634528697 ecr 681965904,nop,nop,sack 1 {3024391132:3024392506}], length 0
10:29:15.555466 IP client.30280 > server.29543: Flags [P.], seq 4048786673:4048786709, ack 3024436474, win 1444, options [nop,nop,TS val 1634528869 ecr 681965904], length 36
10:29:15.719510 IP server.29543 > client.30280: Flags [.], seq 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val 681966602 ecr 1634528369], length 1374
So now I guess the question is why netfilter think this is over the
upper bound.
..And surely enough the answer is within the invalid outbound packets
that aren't considered in the conntrack state machine, the first of whih
is this one:
{"timestamp": "2018-04-17T10:29:14.926143", "dvc": "Netfilter",
"raw.pktlen": 2800, "raw.pktcount": 1, "oob.prefix": "nf_ct_tcp: SEQ is
over the upper bound (over the window of the receiver)", "oob.time.sec":
1523928554, "oob.time.usec": 926143, "oob.mark": 0, "oob.hook": 0,
"oob.family": 2, "oob.protocol": 2048, "oob.uid": 1000, "oob.gid": 100,
"raw.label": 0, "ip.protocol": 6, "ip.tos": 8, "ip.ttl": 64,
"ip.totlen": 2800, "ip.ihl": 5, "ip.csum": 47600, "ip.id": 27809,
"ip.fragoff": 16384, "src_port": 29543, "dest_port": 30280, "tcp.seq":
3024391132, "tcp.ackseq": 4048786673, "tcp.window": 307, "tcp.offset":
0, "tcp.reserved": 0, "tcp.urg": 0, "tcp.ack": 1, "tcp.psh": 0,
"tcp.rst": 0, "tcp.syn": 0, "tcp.fin": 0, "tcp.res1": 0, "tcp.res2": 0,
"tcp.csum": 5201, "oob.in": "", "oob.out": "", "src_ip": "server",
"dest_ip": "client"}
which sequence matches the start seq of the first packet we don't ack,
so that's why conntrack doesn't recongnize that as a valid ack.
Here's some context before that packet:
10:29:14.921305 IP server.29543 > client.30280: Flags [.], seq 3024378314:3024379688, ack 4048786673, win 307, options [nop,nop,TS val 681965804 ecr 1634528336], length 1374
10:29:14.921311 IP server.29543 > client.30280: Flags [.], seq 3024379688:3024381062, ack 4048786673, win 307, options [nop,nop,TS val 681965804 ecr 1634528336], length 1374
10:29:14.922166 IP client.30280 > server.29543: Flags [.], ack 3024365948, win 329, options [nop,nop,TS val 1634528336 ecr 681965778], length 0
10:29:14.922178 IP server.29543 > client.30280: Flags [.], seq 3024381062:3024382436, ack 4048786673, win 307, options [nop,nop,TS val 681965805 ecr 1634528336], length 1374
10:29:14.922182 IP server.29543 > client.30280: Flags [.], seq 3024382436:3024383810, ack 4048786673, win 307, options [nop,nop,TS val 681965805 ecr 1634528336], length 1374
10:29:14.923848 IP client.30280 > server.29543: Flags [.], ack 3024367322, win 352, options [nop,nop,TS val 1634528338 ecr 681965778], length 0
10:29:14.923860 IP server.29543 > client.30280: Flags [.], seq 3024383810:3024385184, ack 4048786673, win 307, options [nop,nop,TS val 681965807 ecr 1634528338], length 1374
10:29:14.923863 IP server.29543 > client.30280: Flags [.], seq 3024385184:3024386558, ack 4048786673, win 307, options [nop,nop,TS val 681965807 ecr 1634528338], length 1374
10:29:14.924175 IP client.30280 > server.29543: Flags [.], ack 3024368696, win 375, options [nop,nop,TS val 1634528338 ecr 681965778], length 0
10:29:14.924187 IP server.29543 > client.30280: Flags [.], seq 3024386558:3024387932, ack 4048786673, win 307, options [nop,nop,TS val 681965807 ecr 1634528338], length 1374
10:29:14.924190 IP server.29543 > client.30280: Flags [.], seq 3024387932:3024389306, ack 4048786673, win 307, options [nop,nop,TS val 681965807 ecr 1634528338], length 1374
10:29:14.925410 IP client.30280 > server.29543: Flags [.], ack 3024370070, win 397, options [nop,nop,TS val 1634528339 ecr 681965779], length 0
10:29:14.925422 IP server.29543 > client.30280: Flags [.], seq 3024389306:3024390680, ack 4048786673, win 307, options [nop,nop,TS val 681965808 ecr 1634528339], length 1374
10:29:14.925425 IP server.29543 > client.30280: Flags [P.], seq 3024390680:3024391132, ack 4048786673, win 307, options [nop,nop,TS val 681965808 ecr 1634528339], length 452
10:29:14.926030 IP client.30280 > server.29543: Flags [.], ack 3024371444, win 420, options [nop,nop,TS val 1634528339 ecr 681965779], length 0
10:29:14.926115 IP server.29543 > client.30280: Flags [.], seq 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val 681965809 ecr 1634528339], length 1374
But the way I see it, (3024392506-3024371444) = 21064 is smaller than
420*64 = 26880.
Even looking at the previous (ack 3024370070, win 397) we get 22436 and
25408 which still work, same with ack 3024368696, win 375 (23810 <
24000)
So that means conntrack is looking at an ack at least as old as ack
3024367322, win 352?...
Is there any way to confirm that? In the code it looks like we match
! before(seq, sender->td_maxend + 1)
I turned pr_debug on in tcp_in_window() for another try and it's a bit
mangled because the information on multiple lines and the function is
called in parallel but it looks like I do have some seq > maxend +1
Although it's weird, the maxend was set WAY earlier apparently?
Apr 17 11:13:14 res=1 sender end=1913287798 maxend=1913316998 maxwin=29312 receiver end=505004284 maxend=505033596 maxwin=29200
then window decreased drastically e.g. previous ack just before refusal:
Apr 17 11:13:53 seq=1913292311 ack=505007789+(0) sack=505007789+(0) win=284 end=1913292311
Apr 17 11:13:53 sender end=1913292311 maxend=1913331607 maxwin=284 scale=0 receiver end=505020155 maxend=505033596 maxwin=39296 scale=7
first res=0 (refused) packet:
Apr 17 11:13:53 odin kernel: seq=505042139 ack=1913292311+(0) ack=1913292311+(0) win=307 end=505044887
Apr 17 11:13:53 res=0 sender end=505033895 maxend=505033596 maxwin=39296 receiver end=1913292311 maxend=1913331607 maxwin=674
> (Earlier email implies this is related to timestamps, but unfortunately
> to best of my knowledge conntrack doesn't look at tcp timestamps).
tcp_timestamp introduces other changes in behaviour so it might still be
relateed.
Recap of attachments:
- new tcpdump output of a failed attempt
- matching invalid packet dumped in json format
- logs of tcp_in_window function unfortunately of a different attempt
Happy to provide new/different logs if it would be useful. Once again
thanks for the extra eyes and useful commands.
--
Dominique Martinet | Asmadeus
View attachment "tcpdump" of type "text/plain" (27334 bytes)
Download attachment "ulogd.json" of type "application/json" (48652 bytes)
Download attachment "dmesg.xz" of type "application/octet-stream" (500772 bytes)
Powered by blists - more mailing lists