Message-ID: <20180416014740.GA12245@nautica>
Date: Mon, 16 Apr 2018 10:47:40 +0900
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Michal Kubecek <mkubecek@...e.cz>, netdev@...r.kernel.org
Subject: Re: tcp hang when socket fills up ?
Eric Dumazet wrote on Fri, Apr 13, 2018:
> That might be caused by some TS val/ecr breakage :
>
> Many acks were received by the server tcpdump,
> but none of them was accepted by TCP stack, for some reason.
>
> Try to disable TCP timestamps, it will give some hint if bug does not reproduce.
This was spot on: after disabling TCP timestamps I can no longer
reproduce the hang.
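For reference, here is roughly how I toggled it for the test run (a
minimal Python sketch; the /proc path is the standard sysctl knob, but
"./reproducer" is just a stand-in for my actual reproducer):

    # toggle net.ipv4.tcp_timestamps for one test run, then restore it
    # (same effect as `sysctl -w net.ipv4.tcp_timestamps=0`; needs root)
    import subprocess

    TS = "/proc/sys/net/ipv4/tcp_timestamps"

    with open(TS) as f:
        saved = f.read()
    try:
        with open(TS, "w") as f:
            f.write("0\n")
        subprocess.run(["./reproducer"])  # hypothetical reproducer binary
    finally:
        with open(TS, "w") as f:
            f.write(saved)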
I've had another look at the original sequence (as seen by the server)
and I don't see much wrong; tell me what I missed:
- the retransmitted packet has seq 32004:33378, so the first ignored ack
would be the one with ack 33378, is that right? (meaning the server did
accept the ack for 32004 but none after that)
Assuming it is, here is an excerpt from around that point (the first
transmission of that packet, then the client's replies):
16:49:26.700531 IP <server>.13317 > <client>.31872: Flags [.], seq 32004:33378, ack 4190, win 307, options [nop,nop,TS val 1313937607 ecr 1617129440], length 1374
...
16:49:26.728084 IP <client>.31872 > <server>.13317: Flags [.], ack 32004, win 759, options [nop,nop,TS val 1617129473 ecr 1313937602], length 0
...
16:49:26.729531 IP <client>.31872 > <server>.13317: Flags [.], ack 33378, win 782, options [nop,nop,TS val 1617129475 ecr 1313937607], length 0
16:49:26.730002 IP <client>.31872 > <server>.13317: Flags [.], ack 34752, win 805, options [nop,nop,TS val 1617129475 ecr 1313937607], length 0
...
16:49:26.731634 IP <client>.31872 > <server>.13317: Flags [.], ack 36126, win 827, options [nop,nop,TS val 1617129476 ecr 1313937607], length 0
- the ecr value matches the val of the packet it acks
- the val is >= that of the previous packet (so it shouldn't be
considered a reordering and should pass the PAWS check? see the model
sketched below)
- even if the packets are processed in parallel and some kind of race
occurs, a "bigger" ack should ack all the previous packets, right?
- Just to make sure, I checked /proc/net/netstat for PAWSEstab but that
is 0:
TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts PruneCalled RcvPruned OfoPruned OutOfWindowIcmps LockDroppedIcmps ArpFilter TW TWRecycled TWKilled PAWSActive PAWSEstab DelayedACKs DelayedACKLocked DelayedACKLost ListenOverflows ListenDrops TCPHPHits TCPPureAcks TCPHPAcks TCPRenoRecovery TCPSackRecovery TCPSACKReneging TCPFACKReorder TCPSACKReorder TCPRenoReorder TCPTSReorder TCPFullUndo TCPPartialUndo TCPDSACKUndo TCPLossUndo TCPLostRetransmit TCPRenoFailures TCPSackFailures TCPLossFailures TCPFastRetrans TCPSlowStartRetrans TCPTimeouts TCPLossProbes TCPLossProbeRecovery TCPRenoRecoveryFail TCPSackRecoveryFail TCPRcvCollapsed TCPDSACKOldSent TCPDSACKOfoSent TCPDSACKRecv TCPDSACKOfoRecv TCPAbortOnData TCPAbortOnClose TCPAbortOnMemory TCPAbortOnTimeout TCPAbortOnLinger TCPAbortFailed TCPMemoryPressures TCPMemoryPressuresChrono TCPSACKDiscard TCPDSACKIgnoredOld TCPDSACKIgnoredNoUndo TCPSpuriousRTOs TCPMD5NotFound TCPMD5Unexpected TCPMD5Failure TCPSackShifted TCPSackMerged TCPSackShiftFallback TCPBacklogDrop PFMemallocDrop TCPMinTTLDrop TCPDeferAcceptDrop IPReversePathFilter TCPTimeWaitOverflow TCPReqQFullDoCookies TCPReqQFullDrop TCPRetransFail TCPRcvCoalesce TCPOFOQueue TCPOFODrop TCPOFOMerge TCPChallengeACK TCPSYNChallenge TCPFastOpenActive TCPFastOpenActiveFail TCPFastOpenPassive TCPFastOpenPassiveFail TCPFastOpenListenOverflow TCPFastOpenCookieReqd TCPFastOpenBlackhole TCPSpuriousRtxHostQueues BusyPollRxPackets TCPAutoCorking TCPFromZeroWindowAdv TCPToZeroWindowAdv TCPWantZeroWindowAdv TCPSynRetrans TCPOrigDataSent TCPHystartTrainDetect TCPHystartTrainCwnd TCPHystartDelayDetect TCPHystartDelayCwnd TCPACKSkippedSynRecv TCPACKSkippedPAWS TCPACKSkippedSeq TCPACKSkippedFinWait2 TCPACKSkippedTimeWait TCPACKSkippedChallenge TCPWinProbe TCPKeepAlive TCPMTUPFail TCPMTUPSuccess
TcpExt: 0 0 0 0 58 0 0 26 0 0 50 0 0 0 0 75402 17 201 0 0 6876848 59804 2258387 0 33 0 0 3 0 0 0 0 1 102 15 0 0 0 1306 60 386 292 10 0 0 108750 201 1 228 1 8 4 0 63 0 3 0 0 0 0 107 1 0 0 0 2834 1962 622 0 0 0 0 0 0 0 0 0 1065022 54160 0 1 3 3 0 0 0 0 0 0 0 475 0 9578 6 8 71 257 5116325 0 0 0 0 0 0 6 0 0 0 61 85 0 0
IpExt: InNoRoutes InTruncatedPkts InMcastPkts OutMcastPkts InBcastPkts OutBcastPkts InOctets OutOctets InMcastOctets OutMcastOctets InBcastOctets OutBcastOctets InCsumErrors InNoECTPkts InECT1Pkts InECT0Pkts InCEPkts
IpExt: 0 0 0 0 206 0 16405602866 8921427728 0 0 57928 0 0 16121410 0 5388 0
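For what it's worth, here is my rough model of the acceptance test I
believe applies here (simplified from my reading of RFC 7323 and the
kernel's tcp_paws_check(); the helper names are mine, and I'm ignoring
the 24-day idle case):

    # simplified model of the PAWS check: a segment is acceptable unless
    # its TSval is more than paws_win behind ts_recent, comparing modulo
    # 2^32 the way the kernel's (s32) cast does
    def s32(x):
        x &= 0xffffffff
        return x - (1 << 32) if x & 0x80000000 else x

    def paws_ok(rcv_tsval, ts_recent, paws_win=1):
        return s32(ts_recent - rcv_tsval) <= paws_win

    # the client TSvals above are monotonic (1617129473 ... 1617129476),
    # so all of these acks should have passed:
    assert paws_ok(1617129475, 1617129473)
    assert paws_ok(1617129476, 1617129475)

If that model is right, none of these acks should have been dropped on
PAWS grounds, which matches PAWSEstab staying at 0.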
Also, here are the per-socket stats I could find (ss -i after
reproducing the hang):
reno wscale:7,7 rto:7456 backoff:5 rtt:32.924/1.41 ato:40 mss:1374
pmtu:1500 rcvmss:1248 advmss:1448 cwnd:1 ssthresh:16
bytes_acked:32004 bytes_received:4189 segs_out:85 segs_in:54
data_segs_out:78 data_segs_in:18 send 333.9Kbps lastsnd:3912
lastrcv:11464 lastack:11387 pacing_rate 21.4Mbps delivery_rate
3.5Mbps busy:12188ms unacked:33 retrans:1/5 lost:33 rcv_rtt:37
rcv_space:29200 rcv_ssthresh:39184 notsent:28796 minrtt:24.986
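One more observation from those numbers: rto:7456 together with
backoff:5 works out to a base RTO of about 233ms doubled five times,
i.e. the retransmit timer has been backing off repeatedly:

    # rto:7456 with backoff:5 => base RTO before backoff was ~233ms,
    # close to the healthy socket's rto:228 below
    >>> 7456 / 2**5
    233.0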
Here are the same stats with TCP timestamps disabled (after running my
reproducer, i.e. outputting a big chunk of text quickly):
reno wscale:7,7 rto:228 rtt:27.267/1.423 ato:40 mss:1386 pmtu:1500
rcvmss:1248 advmss:1460 cwnd:10 ssthresh:12 bytes_acked:17311070
bytes_received:40445 segs_out:13331 segs_in:7523
data_segs_out:13279 data_segs_in:947 send 4.1Mbps lastsnd:5
lastrcv:5 lastack:5 pacing_rate 4.9Mbps delivery_rate 431.4Kbps
app_limited busy:36064ms unacked:1 retrans:0/6 rcv_rtt:9112.95
rcv_space:29233 rcv_ssthresh:39184 minrtt:25.566
So I guess lost:33 matching unacked:33 might be another hint?
I'll need a bit more time reading the code to understand what this all
implies; feel free to beat me to it.
Thanks,
--
Dominique Martinet | Asmadeus