Message-ID: <20240302091110.3e18088c@hermes.local>
Date: Sat, 2 Mar 2024 09:11:10 -0800
From: Stephen Hemminger <stephen@...workplumber.org>
To: netdev@...r.kernel.org
Subject: Fw: [Bug 218552] New: GRE passing Linux MPLS network has poor performance for TCP
Begin forwarded message:
Date: Sat, 02 Mar 2024 15:33:49 +0000
From: bugzilla-daemon@...nel.org
To: stephen@...workplumber.org
Subject: [Bug 218552] New: GRE passing Linux MPLS network has poor performance for TCP
https://bugzilla.kernel.org/show_bug.cgi?id=218552
Bug ID: 218552
Summary: GRE passing Linux MPLS network has poor performance for TCP
Product: Networking
Version: 2.5
Hardware: Intel
OS: Linux
Status: NEW
Severity: high
Priority: P3
Component: Other
Assignee: stephen@...workplumber.org
Reporter: devel@...ynet.dev
Regression: No
Created attachment 305949
--> https://bugzilla.kernel.org/attachment.cgi?id=305949&action=edit
GRE over MPLS poor performance
I'm facing strange behavior on an MPLS network between two routers built on
Linux. When I create a GRE tunnel on one of the routers, or a GRE tunnel that
passes through the Linux MPLS network, TCP traffic performance is very poor,
even if I shrink the MTU or the MSS (see the sketch below).
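For reference, the MTU/MSS shrinking was done roughly like this on the tunnel
interface (gre1001 is created further down; the values here are illustrative,
not the exact ones from every test):
R01# ip link set dev gre1001 mtu 1400
R01# iptables -t mangle -A FORWARD -o gre1001 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
Neither change made any noticeable difference to the TCP throughput.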
The setup is like this:
Inbound traffic:
ISP -> (eth3-0) R02 (eth4-0) -> (MPLS) -> (eth4-0) R01 (eth3-1 & eth4-1) -> VPN server
Outbound traffic:
VPN server -> (eth3-1 & eth4-1) R01 (eth4-0) -> (MPLS) -> (eth4-0) R02 (eth3-0) -> ISP
Routing table on R02:
R02# show ip route vrf internet 89.A.B.1
Routing entry for 89.A.B.1/32
Known via "bgp", distance 200, metric 0, vrf internet, best
Last update 12:43:11 ago
10.100.1.1(vrf default) (recursive), label 81, weight 1
* 10.100.0.1, via mpls0(vrf default), label IPv4 Explicit Null/81, weight 1
R02# show ip route vrf servers 89.A.B.161
Routing entry for 89.A.B.128/26
Known via "bgp", distance 200, metric 0, vrf servers, best
Last update 12:40:35 ago
10.100.1.1(vrf default) (recursive), label 85, weight 1
* 10.100.0.1, via mpls0(vrf default), label IPv4 Explicit Null/85, weight 1
R02# show ip route vrf internet 89.A.B.161
Routing entry for 89.A.B.128/26
Known via "bgp", distance 200, metric 0, vrf internet, best
Last update 12:42:56 ago
10.100.1.1(vrf default) (recursive), label 85, weight 1
* 10.100.0.1, via mpls0(vrf default), label IPv4 Explicit Null/85, weight 1
R02# show ip route vrf internet 178.C.D.0/15
Routing entry for 178.C.D.0/15
Known via "bgp", distance 20, metric 0, vrf internet, best
Last update 14:28:23 ago
193.230.200.47 (recursive), weight 1
* 89.238.245.113, via wan0.650, weight 1
R02# show ip route vrf servers
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF servers:
S>* 0.0.0.0/0 [1/0] is directly connected, internet (vrf internet), weight 1, 14:29:46
Routing table on R01:
R01# show ip route vrf internet 89.A.B.1
Routing entry for 89.A.B.1/32
Known via "local", distance 0, metric 0, vrf internet
Last update 15:07:48 ago
* directly connected, internet
Routing entry for 89.A.B.1/32
Known via "connected", distance 0, metric 0, vrf internet, best
Last update 15:07:48 ago
* directly connected, internet
R01# show ip route vrf servers 89.A.B.161
Routing entry for 89.A.B.128/26
Known via "connected", distance 0, metric 0, vrf servers, best
Last update 14:53:40 ago
* directly connected, lan0.11
R01# show ip route vrf internet 89.A.B.161
Routing entry for 89.A.B.128/26
Known via "bgp", distance 20, metric 0, vrf internet, best
Last update 14:53:50 ago
* directly connected, servers(vrf servers), weight 1
R01# show ip route vrf internet 178.C.D.0/15
Routing entry for 178.C.D.0/15
Known via "bgp", distance 200, metric 0, vrf internet, best
Last update 12:44:27 ago
10.100.2.1(vrf default) (recursive), label 81, weight 1
* 10.100.0.2, via eth4-0(vrf default), label IPv4 Explicit Null/81, weight 1
Create a GRE tunnel:
R01# /sbin/ip link add name gre1001 numtxqueues $(nproc) numrxqueues $(nproc) type gre remote 178.C.D.X local 89.A.B.1 ttl 225 key 1001
R01# ip link set gre1001 up
R10# /sbin/ip link add name gre1001 numtxqueues $(nproc) numrxqueues $(nproc) type gre remote 89.A.B.1 local 178.C.D.X ttl 225 key 1001
R10# ip link set gre1001 up
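For completeness, the /30 addresses seen in the interface outputs below were
assigned in the usual way, e.g.:
R01# ip addr add 10.100.100.129/30 dev gre1001
R10# ip addr add 10.100.100.130/30 dev gre1001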
R01# show interface gre1001
Interface gre1001 is up, line protocol is up
Link ups: 6 last: 2024/03/02 16:50:46.82
Link downs: 6 last: 2024/03/02 16:50:46.82
vrf: default
Description: R01-R10 GRE
index 206 metric 0 mtu 65507 speed 0 txqlen 1000
flags: <UP,POINTOPOINT,RUNNING,NOARP>
Ignore all v4 routes with linkdown
Ignore all v6 routes with linkdown
Type: GRE over IP
HWaddr: 59:26:3a:01
inet 10.100.100.129/30
inet6 fe80::5926:3a01/64
Interface Type GRE
Interface Slave Type None
VTEP IP: 89.A.B.1 , remote 178.C.D.X
protodown: off
R10# show interface gre1001
Interface gre1001 is up, line protocol is up
Link ups: 38 last: 2024/03/02 16:51:35.67
Link downs: 30 last: 2024/03/02 16:51:35.66
vrf: default
Description: R01-R10 GRE
index 357 metric 0 mtu 1472 speed 0 txqlen 1000
flags: <UP,POINTOPOINT,RUNNING,NOARP>
Type: GRE over IP
HWaddr: b2:26:6c:bc
inet 10.100.100.130/30
inet6 fe80::b226:6cbc/64
Interface Type GRE
Interface Slave Type None
VTEP IP: 178.C.D.X , remote 89.A.B.1
protodown: off
Testing:
R10# iperf3 -c 10.100.100.129
Connecting to host 10.100.100.129, port 5201
[ 5] local 10.100.100.130 port 51610 connected to 10.100.100.129 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 2.38 MBytes 19.9 Mbits/sec 20 2.77 KBytes
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 8 2.77 KBytes
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 8 2.77 KBytes
[ 5] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec 8 2.77 KBytes
[ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec 8 4.16 KBytes
[ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec 8 5.55 KBytes
[ 5] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec 12 2.77 KBytes
[ 5] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 8 2.77 KBytes
[ 5] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 12 2.77 KBytes
[ 5] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec 10 2.77 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.38 MBytes 2.00 Mbits/sec 102 sender
[ 5] 0.00-10.04 sec 128 KBytes 104 Kbits/sec receiver
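Shrinking the MSS directly from iperf3 behaves the same way; a retest of this
shape (MSS value illustrative) gives essentially the same numbers:
R10# iperf3 -c 10.100.100.129 -M 1000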
R10# iperf3 -c 10.100.100.129 -R
Connecting to host 10.100.100.129, port 5201
Reverse mode, remote host 10.100.100.129 is sending
[ 5] local 10.100.100.130 port 50280 connected to 10.100.100.129 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.03 sec 47.4 MBytes 386 Mbits/sec
[ 5] 1.03-2.02 sec 30.9 MBytes 261 Mbits/sec
[ 5] 2.02-3.00 sec 25.8 MBytes 220 Mbits/sec
[ 5] 3.00-4.01 sec 27.0 MBytes 224 Mbits/sec
[ 5] 4.01-5.01 sec 28.4 MBytes 238 Mbits/sec
[ 5] 5.01-6.00 sec 28.0 MBytes 238 Mbits/sec
[ 5] 6.00-7.01 sec 28.1 MBytes 235 Mbits/sec
[ 5] 7.01-8.00 sec 28.8 MBytes 242 Mbits/sec
[ 5] 8.00-9.03 sec 29.0 MBytes 237 Mbits/sec
[ 5] 9.03-10.01 sec 28.2 MBytes 242 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.05 sec 305 MBytes 255 Mbits/sec 8 sender
[ 5] 0.00-10.01 sec 302 MBytes 253 Mbits/sec receiver
Even with tcpdump on the MPLS network I capture a very low number of packets
(the attached capture was taken with a command of the shape shown below):
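Sketch of the capture command on the MPLS-facing interface (interface name and
filter are illustrative):
R02# tcpdump -ni eth4-0 -w gre-over-mpls.pcap mpls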
I tested it from the VPN server to several Cisco routers, and every time the
GRE tunnel passes through the Linux MPLS network I see the same huge TCP
degradation. If I move the tunnels to GUE, FOU or IPIP, performance is over
250 Mbits/s; the outputs below show gre1001 recreated as an IPIP tunnel (note
the link/ipip type), as sketched next.
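The IPIP recreation was essentially this (a sketch, not the verbatim commands;
the GRE key is dropped because plain IPIP has no key option), mirrored on R10
with local and remote swapped:
R01# ip link del gre1001
R01# ip link add name gre1001 type ipip remote 178.C.D.X local 89.A.B.1 ttl 225
R01# ip link set gre1001 up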
R10# ip a l gre1001
358: gre1001@...E: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 178.C.D.X peer 89.A.B.1
inet 10.100.100.130/30 brd 10.100.100.131 scope global gre1001
valid_lft forever preferred_lft forever
inet6 fe80::200:5efe:b226:6cbc/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
R01# ip a l gre1001
207: gre1001@...E: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 89.A.B.1 peer 178.C.D.X
inet 10.100.100.129/30 brd 10.100.100.131 scope global gre1001
valid_lft forever preferred_lft forever
inet6 fe80::200:5efe:5926:3a01/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
root@R10:~# iperf3 -c 10.100.100.129 -R
Connecting to host 10.100.100.129, port 5201
Reverse mode, remote host 10.100.100.129 is sending
[ 5] local 10.100.100.130 port 42162 connected to 10.100.100.129 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.02 sec 48.0 MBytes 395 Mbits/sec
[ 5] 1.02-2.01 sec 33.4 MBytes 283 Mbits/sec
[ 5] 2.01-3.01 sec 35.0 MBytes 292 Mbits/sec
[ 5] 3.01-4.01 sec 36.6 MBytes 307 Mbits/sec
[ 5] 4.01-5.01 sec 37.6 MBytes 317 Mbits/sec
[ 5] 5.01-6.00 sec 38.4 MBytes 322 Mbits/sec
[ 5] 6.00-7.00 sec 38.2 MBytes 321 Mbits/sec
[ 5] 7.00-8.00 sec 38.5 MBytes 323 Mbits/sec
[ 5] 8.00-9.01 sec 39.1 MBytes 327 Mbits/sec
[ 5] 9.01-10.01 sec 38.9 MBytes 327 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.05 sec 388 MBytes 324 Mbits/sec 12 sender
[ 5] 0.00-10.01 sec 384 MBytes 322 Mbits/sec receiver
iperf Done.
root@R10:~# iperf3 -c 10.100.100.129
Connecting to host 10.100.100.129, port 5201
[ 5] local 10.100.100.130 port 43416 connected to 10.100.100.129 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 41.1 MBytes 345 Mbits/sec 0 3.75 MBytes
[ 5] 1.00-2.00 sec 46.2 MBytes 388 Mbits/sec 5 1.34 MBytes
[ 5] 2.00-3.00 sec 36.2 MBytes 304 Mbits/sec 0 1.42 MBytes
[ 5] 3.00-4.00 sec 36.2 MBytes 304 Mbits/sec 0 1.48 MBytes
[ 5] 4.00-5.00 sec 38.8 MBytes 325 Mbits/sec 0 1.52 MBytes
[ 5] 5.00-6.00 sec 40.0 MBytes 335 Mbits/sec 0 1.55 MBytes
[ 5] 6.00-7.00 sec 38.8 MBytes 325 Mbits/sec 0 1.57 MBytes
[ 5] 7.00-8.00 sec 40.0 MBytes 336 Mbits/sec 0 1.57 MBytes
[ 5] 8.00-9.00 sec 40.0 MBytes 336 Mbits/sec 0 1.57 MBytes
[ 5] 9.00-10.00 sec 40.0 MBytes 335 Mbits/sec 0 1.57 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 397 MBytes 333 Mbits/sec 5 sender
[ 5] 0.00-10.04 sec 396 MBytes 330 Mbits/sec receiver
iperf Done.
The routers' NICs are 40GbE Mellanox MCX354A-FCBT. The MPLS interfaces have an MTU of 9216.
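With the MPLS core at MTU 9216 and the tunnels at 1472-1480, a DF-set probe
through the tunnel is a quick sanity check that tunneled packets of a given
size survive the path (size illustrative):
R10# ping -M do -s 1400 -c 3 10.100.100.129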
The solution was to move all tunnels between the VPN server and the Cisco
routers to IPIP.
Has anybody faced such an issue? I have no clue what to optimize, or whether
this is a kernel bug.
Tested on 6.5.x and 6.6.x kernels.
I've attached a small capture of traffic from the test with poor performance.