Message-ID: <1323937469.2631.31.camel@edumazet-laptop>
Date: Thu, 15 Dec 2011 09:24:29 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Carsten Wolff <carsten@...ffcarsten.de>
Cc: Yuchung Cheng <ycheng@...gle.com>,
"Esztermann, Ansgar" <Ansgar.Esztermann@...-bpc.mpg.de>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: TCP fast retransmit
On Thursday, 15 December 2011 at 08:41 +0100, Carsten Wolff wrote:
> On Wednesday 14 December 2011, Eric Dumazet wrote:
> > On Wednesday, 14 December 2011 at 11:00 -0800, Yuchung Cheng wrote:
> > > I use tcptrace to check the time sequence and I am puzzled:
> > > I see a lot of OOO packets too but how can this happen at a sender-side
> > > trace? unless the trace is taken close to but not exactly at the sender.
> > > I would expect to see in-sequence packets but lots of SACKs plus some
> > > spurious retransmits.
> >
> > I understood the trace was a receiver-side one (a linux machine if I am
> > not mistaken, while the sender is AIX powered)
> >
> > (Looking at timings of ACKS, coming a few us after corresponding data
> > packet arrival)
>
> Oh. Right. This also means, that net.ipv4.tcp_reordering is only available at
> the receiver (Linux), which doesn't help, because the reordering robustness
> stuff happens on sender-side. So don't even bother changing that sysctl.
>
Oh well, reading Ansgar's mail, it seems it is the other way around.
Quote:
2.6.37.6 with openSUSE patches in the sender, some version of AIX in the
receiver. The latter seems to be critical: we've never encountered this
problem with any other combination of OSs but AIX & Linux.
The only thing I don't understand is how we can receive an ACK so fast
(6 us after the data packet it acknowledges, and even 3 us a bit later).
This does not seem possible, even with 10Gb infrastructure. (A Cisco
firewall was mentioned.)
12:18:20.732998 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 284400:287136, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 2736
12:18:20.733004 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 287136, win 591, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733048 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 287136:293976, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 6840
12:18:20.733073 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 293976, win 549, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733104 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 293976:298080, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 4104
12:18:20.733120 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 298080, win 522, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
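The 6 us figure can be checked directly from the first two trace lines. The following is only a rough Python sketch: the helper names (ts_us, ack_delay_us, wire_time_us) are made up for illustration, the TCP options are dropped from the copied lines for brevity, and the wire-time estimate counts payload bytes only, ignoring headers and framing.

```python
import re

# First data segment and its ACK, copied from the trace (TCP options omitted).
DATA = ("12:18:20.732998 IP 134.76.98.13.1500 > 10.208.9.87.35337: "
        "Flags [.], seq 284400:287136, ack 555, win 65280, length 2736")
ACK = ("12:18:20.733004 IP 10.208.9.87.35337 > 134.76.98.13.1500: "
       "Flags [.], ack 287136, win 591, length 0")


def ts_us(line):
    """Capture timestamp of a tcpdump line, in microseconds since midnight."""
    h, m, s, us = re.match(r"(\d+):(\d+):(\d+)\.(\d{6})", line).groups()
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1_000_000 + int(us)


def ack_delay_us(data_line, ack_line):
    """Microseconds between a data segment and the ACK that covers its last byte."""
    seq_end = int(re.search(r"seq \d+:(\d+)", data_line).group(1))
    ack_no = int(re.search(r"\back (\d+)", ack_line).group(1))
    if ack_no != seq_end:
        raise ValueError("ACK does not acknowledge the end of this segment")
    return ts_us(ack_line) - ts_us(data_line)


def wire_time_us(nbytes, gbit_per_s=10):
    """Time to serialize nbytes of payload on the link, in microseconds."""
    return nbytes * 8 / (gbit_per_s * 1000)


if __name__ == "__main__":
    print("ACK delay (us):", ack_delay_us(DATA, ACK))
    print("payload wire time at 10Gb (us):", wire_time_us(2736))
```

Putting the 2736 bytes on a 10Gb wire already takes about 2.2 us, so a 6 us gap leaves very little for a path through a firewall, the peer stack and back.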
Here the next two packets we send are out of order.
12:18:20.733161 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 299448:300816, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733164 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 298080, win 522, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {299448:300816}], length 0
12:18:20.733166 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 298080:299448, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733169 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 300816:302184, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733171 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 303552:304920, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733173 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 302184, win 490, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {303552:304920}], length 0
12:18:20.733174 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 302184:303552, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733177 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 304920, win 469, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733224 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 304920:310392, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 5472
12:18:20.733228 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 311760:313128, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733230 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:313128}], length 0
12:18:20.733272 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 313128:315864, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 2736
12:18:20.733276 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:315864}], length 0
12:18:20.733326 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 315864:319968, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 4104
12:18:20.733330 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:319968}], length 0
12:18:20.733332 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 310392:311760, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733333 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 321336:322704, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733335 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 319968, win 353, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {321336:322704}], length 0
12:18:20.733372 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 322704:324072, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733375 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 319968, win 353, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {321336:324072}], length 0
12:18:20.733377 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 319968:321336, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733381 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 324072, win 327, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
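The out-of-order segments in a trace like this can be flagged mechanically. Below is a minimal Python sketch, not a real tool: the names (data_segments, classify) and the labels "gap" / "late" / "in-order" are my own, the TCP options are dropped from the copied lines, and a "late" segment could just as well be a retransmit, which this simple check cannot tell apart.

```python
import re

# Data segments from the sender, copied from the trace (TCP options omitted).
TRACE = """\
12:18:20.733161 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 299448:300816, ack 555, win 65280, length 1368
12:18:20.733166 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 298080:299448, ack 555, win 65280, length 1368
12:18:20.733169 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 300816:302184, ack 555, win 65280, length 1368
12:18:20.733171 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 303552:304920, ack 555, win 65280, length 1368
12:18:20.733174 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 302184:303552, ack 555, win 65280, length 1368
""".splitlines()

SEQ_RE = re.compile(r"^\S+ IP (\S+) > \S+: .*?seq (\d+):(\d+)")


def data_segments(lines, sender):
    """Yield (seq_start, seq_end) for every data segment sent by `sender`."""
    for line in lines:
        m = SEQ_RE.match(line)
        if m and m.group(1) == sender:
            yield int(m.group(2)), int(m.group(3))


def classify(segments, next_expected):
    """Label each segment relative to the highest sequence number seen so far."""
    highest = next_expected
    result = []
    for start, end in segments:
        if start > highest:
            label = "gap"        # skips over bytes not seen yet
        elif start < highest:
            label = "late"       # arrives below data already seen
        else:
            label = "in-order"
        result.append((start, end, label))
        highest = max(highest, end)
    return result


if __name__ == "__main__":
    # 298080 is the end of the last in-order segment before this excerpt.
    segs = data_segments(TRACE, "134.76.98.13.1500")
    for start, end, label in classify(segs, 298080):
        print(start, end, label)
```

On the five segments above this reports gap, late, in-order, gap, late, which matches the two swapped pairs visible in the capture.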
Really, my feeling is that this trace was taken on the receiver, and maybe
LRO/GRO is buggy?
Ansgar, please provide more details, such as the NIC you use (hardware,
driver versions...).