lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 26 Feb 2021 11:41:22 +0100
From:   Eric Dumazet <edumazet@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: Spurious TCP retransmissions on ack vs kfree_skb reordering

On Fri, Feb 26, 2021 at 11:05 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Fri, Feb 26, 2021 at 4:15 AM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Thu, 25 Feb 2021 15:25:15 -0800 Jakub Kicinski wrote:
> > > Hi!
> > >
> > > We see large (4-8x) increase of what looks like TCP RTOs after rising
> > > the Tx coalescing above Rx coalescing timeout.
> > >
> > > Quick tracing of the events seems to indicate that the data has already
> > > been acked when we enter tcp:tcp_retransmit_skb:
> >
> > Seems like I'm pretty lost here and the tcp:tcp_retransmit_skb events
> > are less spurious than I thought. Looking at some tcpdump traces we see:
> >
> > 0.045277 IP6 A > B: Flags [SEW], seq 2248382925:2248383296, win 61920, options [mss 1440,sackOK,TS val 658870494 ecr 0,nop,wscale 11], length 371
> >
> > 0.045348 IP6 B > A: Flags [S.E], seq 961169456, ack 2248382926, win 65535, options [mss 1440,sackOK,TS val 883864022 ecr 658870494,nop,wscale 9], length 0
>
> The SYNACK does not include the prior payload.
>
> > 0.045369 IP6 A > B: Flags [P.], seq 1:372, ack 1, win 31, options [nop,nop,TS val 658870494 ecr 883864022], length 371
>
> So this rtx is not spurious.
>
> However in your prior email you wrote :
>
> bytes_in:      0
> bytes_out:   742
> bytes_acked: 742
>
> Are you sure that at the time of the retransmit, bytes_acked was 742 ?
> I do not see how this could happen.

Yes, this packetdrill test confirms TCP INFO stuff seems correct .

//
// SYN-data are not fully acknowledged
//
`../../common/defaults.sh
 sysctl -wq net.ipv4.tcp_timestamps=0 `

// Cache warmup: send a Fast Open cookie request
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
   +0 sendto(3, ..., 0, MSG_FASTOPEN, ..., ...) = -1 EINPROGRESS
(Operation is now in progress)
   +0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop>
 +.01 < S. 123:123(0) ack 1 win 5840 <mss
1460,nop,nop,sackOK,nop,wscale 6,FO 1234abcd,nop,nop>
   +0 %{ assert tcpi_delivered == 1, tcpi_delivered }%
   +0 > . 1:1(0) ack 1
 +.01 close(3) = 0
   +0 > F. 1:1(0) ack 1
 +.01 < F. 1:1(0) ack 2 win 92
   +0 > .  2:2(0) ack 2


//
// Test: the data in SYN-data are retransmitted when server acks only ISN
//
   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
   +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
   +0 sendto(4, ..., 700, MSG_FASTOPEN, ..., ...) = 700
   +0 > S 0:700(700) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO 1234abcd,nop,nop>
 +.02 < S. 0:0(0) ack 1 win 5840 <mss 1040,nop,nop,sackOK,nop,wscale
6,FO 1234abcd,nop,nop>
// Retransmits data in SYN-data on the first ACK.
   +0 %{ assert tcpi_delivered == 1, tcpi_delivered }%
   +0 %{ assert tcpi_bytes_acked == 1, tcpi_bytes_acked }%
   +0 %{ assert tcpi_bytes_sent == 2*700, tcpi_bytes_sent }%
   +0 > P. 1:701(700) ack 1
   +0 < P. 1:1001(1000) ack 701 win 257
   +0 %{ assert tcpi_delivered == 2, tcpi_delivered }%
   +0 %{ assert tcpi_bytes_acked == 701, tcpi_bytes_acked }%
   +0 > . 701:701(0) ack 1001
   +0 read(4, ..., 1024) = 1000
   +0 %{ assert (tcpi_options & TCPI_OPT_SYN_DATA) == 0, tcpi_options }%
   +0 close(4) = 0
   +0 > F. 701:701(0) ack 1001
 +.02 < F. 1001:1001(0) ack 702 win 257
   +0 > . 702:702(0) ack 1002

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ