Message-ID: <20170926131011.GB26395@castle.DHCP.thefacebook.com>
Date: Tue, 26 Sep 2017 14:10:11 +0100
From: Roman Gushchin <guro@...com>
To: Yuchung Cheng <ycheng@...gle.com>
CC: Oleksandr Natalenko <oleksandr@...alenko.name>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
netdev <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c
> On Wed, Sep 20, 2017 at 6:46 PM, Roman Gushchin <guro@...com> wrote:
> >
> > > Hello.
> > >
> > > Since, IIRC, v4.11, there is some regression in TCP stack resulting in the
> > > warning shown below. Most of the time it is harmless, but rarely it just
> > > causes either freeze or (I believe, this is related too) panic in
> > > tcp_sacktag_walk() (because sk_buff passed to this function is NULL).
> > > Unfortunately, I still do not have proper stacktrace from panic, but will try
> > > to capture it if possible.
> > >
> > > Also, I have custom settings regarding TCP stack, shown below as well. ifb is
> > > used to shape traffic with tc.
> > >
> > > Please note this regression was already reported as BZ [1] and as a letter to
> > > ML [2], but got neither attention nor resolution. It is reproducible for (not
> > > only) me on my home router since v4.11 till v4.13.1 incl.
> > >
> > > Please advise on how to deal with it. I'll provide any additional info if
> > > necessary, also ready to test patches if any.
> > >
> > > Thanks.
> > >
> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=195835
> > > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg436158.html&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=MDDRfLG5DvdOeniMpaZDJI8ulKQ6PQ6OX_1YtRsiTMA&s=-n3dGZw-pQ95kMBUfq5G9nYZFcuWtbTDlYFkcvQPoKc&e=
> >
> > We're experiencing the same problems on some machines in our fleet.
> > Exactly the same symptoms: tcp_fastretrans_alert() warnings and
> > sometimes panics in tcp_sacktag_walk().
> >
> > Here is an example of a backtrace with the panic log:
Hi Yuchung!
> do you still see the panics if you disable RACK?
> sysctl net.ipv4.tcp_recovery=0?
No, we haven't seen any crashes since then.
>
> Also, have you experienced any SACK reneging? Could you post the output of
> 'nstat | grep -i TCP'? Thanks.
hostname TcpActiveOpens 2289680 0.0
hostname TcpPassiveOpens 3592758 0.0
hostname TcpAttemptFails 746910 0.0
hostname TcpEstabResets 154988 0.0
hostname TcpInSegs 16258678255 0.0
hostname TcpOutSegs 46967011611 0.0
hostname TcpRetransSegs 13724310 0.0
hostname TcpInErrs 2 0.0
hostname TcpOutRsts 9418798 0.0
hostname TcpExtEmbryonicRsts 2303 0.0
hostname TcpExtPruneCalled 90192 0.0
hostname TcpExtOfoPruned 57274 0.0
hostname TcpExtOutOfWindowIcmps 3 0.0
hostname TcpExtTW 1164705 0.0
hostname TcpExtTWRecycled 2 0.0
hostname TcpExtPAWSEstab 159 0.0
hostname TcpExtDelayedACKs 209207209 0.0
hostname TcpExtDelayedACKLocked 508571 0.0
hostname TcpExtDelayedACKLost 1713248 0.0
hostname TcpExtListenOverflows 625 0.0
hostname TcpExtListenDrops 625 0.0
hostname TcpExtTCPHPHits 9341188489 0.0
hostname TcpExtTCPPureAcks 1434646465 0.0
hostname TcpExtTCPHPAcks 5733614672 0.0
hostname TcpExtTCPSackRecovery 3261698 0.0
hostname TcpExtTCPSACKReneging 12203 0.0
hostname TcpExtTCPSACKReorder 433189 0.0
hostname TcpExtTCPTSReorder 22694 0.0
hostname TcpExtTCPFullUndo 45092 0.0
hostname TcpExtTCPPartialUndo 22016 0.0
hostname TcpExtTCPLossUndo 2150040 0.0
hostname TcpExtTCPLostRetransmit 60119 0.0
hostname TcpExtTCPSackFailures 2626782 0.0
hostname TcpExtTCPLossFailures 182999 0.0
hostname TcpExtTCPFastRetrans 4334275 0.0
hostname TcpExtTCPSlowStartRetrans 3453348 0.0
hostname TcpExtTCPTimeouts 1070997 0.0
hostname TcpExtTCPLossProbes 2633545 0.0
hostname TcpExtTCPLossProbeRecovery 941647 0.0
hostname TcpExtTCPSackRecoveryFail 336302 0.0
hostname TcpExtTCPRcvCollapsed 461354 0.0
hostname TcpExtTCPAbortOnData 349196 0.0
hostname TcpExtTCPAbortOnClose 3395 0.0
hostname TcpExtTCPAbortOnTimeout 51201 0.0
hostname TcpExtTCPMemoryPressures 2 0.0
hostname TcpExtTCPSpuriousRTOs 2120503 0.0
hostname TcpExtTCPSackShifted 2613736 0.0
hostname TcpExtTCPSackMerged 21358743 0.0
hostname TcpExtTCPSackShiftFallback 8769387 0.0
hostname TcpExtTCPBacklogDrop 5 0.0
hostname TcpExtTCPRetransFail 843 0.0
hostname TcpExtTCPRcvCoalesce 949068035 0.0
hostname TcpExtTCPOFOQueue 470118 0.0
hostname TcpExtTCPOFODrop 9915 0.0
hostname TcpExtTCPOFOMerge 9 0.0
hostname TcpExtTCPChallengeACK 90 0.0
hostname TcpExtTCPSYNChallenge 3 0.0
hostname TcpExtTCPFastOpenActive 2089 0.0
hostname TcpExtTCPSpuriousRtxHostQueues 896596 0.0
hostname TcpExtTCPAutoCorking 547386735 0.0
hostname TcpExtTCPFromZeroWindowAdv 28757 0.0
hostname TcpExtTCPToZeroWindowAdv 28761 0.0
hostname TcpExtTCPWantZeroWindowAdv 322431 0.0
hostname TcpExtTCPSynRetrans 3026 0.0
hostname TcpExtTCPOrigDataSent 40976870977 0.0
hostname TcpExtTCPHystartTrainDetect 453920 0.0
hostname TcpExtTCPHystartTrainCwnd 11586273 0.0
hostname TcpExtTCPHystartDelayDetect 10943 0.0
hostname TcpExtTCPHystartDelayCwnd 763554 0.0
hostname TcpExtTCPACKSkippedPAWS 30 0.0
hostname TcpExtTCPACKSkippedSeq 218 0.0
hostname TcpExtTCPWinProbe 2408 0.0
hostname TcpExtTCPKeepAlive 213768 0.0
hostname TcpExtTCPMTUPFail 69 0.0
hostname TcpExtTCPMTUPSuccess 8811 0.0
Thanks!
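
For anyone who wants to compare counters like the ones above across machines, here is a minimal sketch in Python. It assumes the four-column layout shown in this message (a host prefix, the counter name, the value, and a rate); plain nstat output has no host prefix and would need the field check adjusted. The function names parse_nstat and ratio are hypothetical helpers, and relating TCPSACKReneging to TCPSackRecovery is only an illustrative way to put the reneging count in proportion, not a diagnostic method taken from this thread.

```python
def parse_nstat(text):
    """Parse lines of the form '<host> <counter> <value> <rate>' into a dict.

    Assumption: four whitespace-separated fields per line, as in the dump
    posted in this message. Lines that do not match are skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        counters[fields[1]] = int(fields[2])
    return counters


def ratio(counters, numerator, denominator):
    """Return numerator/denominator, or None if a counter is missing or the
    denominator is zero."""
    num = counters.get(numerator)
    den = counters.get(denominator)
    if num is None or not den:
        return None
    return num / den


# Three of the counters from the dump above, copied verbatim.
SAMPLE = """\
hostname TcpExtTCPSackRecovery 3261698 0.0
hostname TcpExtTCPSACKReneging 12203 0.0
hostname TcpExtTCPSACKReorder 433189 0.0
"""

if __name__ == "__main__":
    c = parse_nstat(SAMPLE)
    r = ratio(c, "TcpExtTCPSACKReneging", "TcpExtTCPSackRecovery")
    print(f"SACK reneging events per SACK recovery: {r:.4%}")
```

Feeding the full dump to parse_nstat works the same way; counters absent on a given kernel simply make ratio return None instead of raising.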