Message-ID: <2849a9af-999d-822a-9e65-b3e1d3e67f42@fnal.gov>
Date:   Tue, 25 Feb 2020 09:08:49 -0600
From:   Ron Rechenmacher <ron@...l.gov>
To:     netdev@...r.kernel.org
Subject: retransmissions/out-of-order packets with large (~1 MB) writes/reads
 w/ TCP (SOCK_STREAM) over localhost loopback

I'm seeing TCP retransmissions on the loopback interface and have been searching for days to
determine whether I should be surprised by this (I am) or whether anyone else is seeing it,
but I haven't found any answers. Apologies if I'm just missing something. Someone mentioned
a loopback specification, but I can't find it.

Here's one thing that I'm doing (tried on several modern kernels, including 5.5.4):

sudo tcpdump -s78 -wt.tcpdump -ilo port 7001 & tcpdump_pid=$!; \
taskset -c 1 ./tcp_loopback.py -s -b1048576 & sleep .5; taskset -c 2 ./tcp_loopback.py -c -b1048576 --count=8192; \
sudo kill $tcpdump_pid; \
tshark -r t.tcpdump | grep -i retrans

The tcp_loopback.py script is available at home.fnal.gov/~ron/tcp_loopback.py

The heart of the server portion (-s) is:

     sock = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
     sock.setsockopt( socket.SOL_SOCKET, socket.SO_REUSEADDR, 1 )
     sock.bind( ('127.0.0.1', port) )
     sock.listen( 4 )
     sockconn,address = sock.accept()
     while 1:
         data = sockconn.recv(bs)
         if opargs['-v']=='': print('received: '+str(len(data)))
         if len(data) == 0:
             if opargs['-v']=='': print('0 data, closing')
             break

The heart of the client portion (-c) is:

     sock = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
     sock.connect( ('127.0.0.1',port) )
     for xx in range(cnt): sock.send( '*'*bs )
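
For readers who want to try the same traffic pattern without the full script, here is a
minimal single-process sketch in Python 3. It is not tcp_loopback.py: the name
run_loopback and its parameters are made up for illustration, it uses sendall() and a
bytes payload, and it runs the server in a thread instead of pinning two processes with
taskset, so its timing will differ from the setup above.

```python
import socket
import threading


def run_loopback(block_size=1 << 20, count=8, port=0):
    """Send `count` writes of `block_size` bytes over 127.0.0.1.

    Returns the total number of bytes the server side received.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))  # port 0 lets the kernel pick a free port
    listener.listen(4)
    port = listener.getsockname()[1]
    received = [0]

    def serve():
        conn, _addr = listener.accept()
        with conn:
            while True:
                data = conn.recv(block_size)
                if not data:  # empty read: the client closed the connection
                    break
                received[0] += len(data)

    server = threading.Thread(target=serve)
    server.start()

    payload = b"*" * block_size
    with socket.create_connection(("127.0.0.1", port)) as client:
        for _ in range(count):
            client.sendall(payload)  # loops until the whole block is queued

    server.join()
    listener.close()
    return received[0]
```

Calling run_loopback(block_size=1048576, count=8192) while tcpdump captures on lo
approximates the command line above, minus the CPU pinning.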

The probability of retransmissions seems to increase with larger (e.g. 2 MB, 4 MB) writes/reads.

I've read (e.g. Documentation/networking/scaling.rst) about out-of-order issues related
to scheduling on different cores, hence the use of taskset above.

Is there a way to prevent this from happening (while still using large writes/reads at
high rate)?

With loopback, I don't really know whether I'm looking at the send side sending segments
out of order or the receive side receiving them out of order. My ultimate goal is to
establish a baseline for low-latency inter-node transmission in a 100 Gb/s, high-congestion
(many-to-one) environment. I developed an application which uses the "debug socket" to get
retransmission information, and I was surprised to see retransmissions on localhost.
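
One way to read a per-connection retransmission counter from user space on Linux is the
TCP_INFO socket option. The sketch below is an illustration of that approach and is not
necessarily the "debug socket" mechanism mentioned above. It assumes the leading part of
struct tcp_info from linux/tcp.h is laid out as eight one-byte fields followed by 32-bit
counters, with tcpi_total_retrans as the 24th counter. The helper names are made up.

```python
import socket
import struct

# Assumed prefix of Linux's struct tcp_info: 8 one-byte fields, then 24
# 32-bit counters ending with tcpi_total_retrans (104 bytes in total).
_TCP_INFO_PREFIX = struct.Struct("=8B24I")
_TOTAL_RETRANS_INDEX = 8 + 23  # position of tcpi_total_retrans in the tuple


def parse_total_retrans(buf):
    """Extract tcpi_total_retrans from a raw TCP_INFO buffer."""
    if len(buf) < _TCP_INFO_PREFIX.size:
        raise ValueError("TCP_INFO buffer too short: %d bytes" % len(buf))
    fields = _TCP_INFO_PREFIX.unpack_from(buf)
    return fields[_TOTAL_RETRANS_INDEX]


def total_retrans(sock):
    """Return the retransmitted-segment count for a connected TCP socket.

    Linux only: socket.TCP_INFO is not defined on every platform.
    """
    buf = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO,
                          _TCP_INFO_PREFIX.size)
    return parse_total_retrans(buf)
```

Calling total_retrans(sock) on the sending socket before and after a burst of writes
gives the number of segments retransmitted in between, without needing a packet capture.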

Can anyone please help me understand what's happening and if there are any knobs to turn to
eliminate retransmission while still maximizing data rate?

Thanks,
Ron

Example output:

/home/ron/notes
ron@...lap77 :^) sudo tcpdump -s78 -wt.tcpdump -ilo port 7001 & tcpdump_pid=$!; \
 > taskset -c 1 ./tcp_loopback.py -s -b1048576 & sleep .5; taskset -c 2 ./tcp_loopback.py -c -b1048576 --count=8192; \
 > sudo kill $tcpdump_pid; \
 > tshark -r t.tcpdump | grep -i retrans
[1] 31571
[2] 31572
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 78 bytes
207707 packets captured
415440 packets received by filter
0 packets dropped by kernel
[2]+  Done                    taskset -c 1 ./tcp_loopback.py -s -b1048576
77373   0.442572    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3201826393 Ack=1 Win=65536 Len=65483 TSval=2684544027 TSecr=2684544026
77374   0.442574    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3201891876 Ack=1 Win=65536 Len=65483 TSval=2684544027 TSecr=2684544026
77375   0.442576    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3201957359 Ack=1 Win=65536 Len=65483 TSval=2684544027 TSecr=2684544026
79452   0.454359    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Spurious Retransmission] 44842 → 7001 [ACK] Seq=3313696610 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
79453   0.454362    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3313893059 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
79454   0.454365    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3313958542 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
79455   0.454367    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3314024025 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
79456   0.454370    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3314089508 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
79457   0.454373    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Retransmission] 44842 → 7001 [ACK] Seq=3314154991 Ack=1 Win=65536 Len=65483 TSval=2684544038 TSecr=2684544038
[1]+  Done                    sudo tcpdump -s78 -wt.tcpdump -ilo port 7001
--2020-02-25_09:02:16--
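
The output above mixes plain and spurious retransmissions. As a rough sketch, the
tshark text output could be tallied by tag instead of grepped. The function name
count_retransmissions is hypothetical, and the pattern only looks for the
"[TCP ... Retransmission]" labels seen above plus "Fast Retransmission".

```python
import re
from collections import Counter

# Matches the analysis tags tshark prints, e.g. "[TCP Spurious Retransmission]".
_TAG = re.compile(r"\[TCP ((?:Spurious |Fast )?Retransmission)\]")


def count_retransmissions(lines):
    """Count tshark output lines per retransmission tag."""
    counts = Counter()
    for line in lines:
        match = _TAG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It can be fed from a pipe, for example by passing sys.stdin to it after
`tshark -r t.tcpdump`.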

