Message-ID: <54D21119.6050609@opendium.com>
Date: Wed, 04 Feb 2015 12:31:21 +0000
From: Steve Hill <steve@...ndium.com>
To: netdev@...r.kernel.org
Subject: MTU problems with GRE and IPv6
I'm having some problems related to oversized IPv6 packets on a GRE
tunnel under Scientific Linux 6.6 (Kernel 2.6.32-504.3.3.el6.x86_64).
I have a set of machines set up as follows:
Client
|
Router
|
(Internet)
|
Physical KVM Host
|
Server (KVM virtual machine)
All of these machines have both IPv4 and IPv6 connectivity.
The Client<->Router connection is ethernet with a 1500 octet MTU and the
virtual NIC between the KVM host and the server also has a 1500 octet MTU.
There is a GRE-over-IPv4 tunnel between Router and Server with an MTU of
1476. On Server, traffic is normally routed via the virtual NIC, but
iptables/ip6tables sets a CONNMARK on any traffic arriving over the GRE
tunnel, and that mark is used to select a different routing table for the
reply traffic so that it goes back over the GRE tunnel.
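For reference, the 1476 octet tunnel MTU is consistent with a 1500 octet
underlay minus the encapsulation overhead. A minimal sketch of that
arithmetic follows. The header sizes are assumptions not stated above (a
20 octet outer IPv4 header without options and a 4 octet base GRE header
with no key, checksum or sequence fields), and gre_tunnel_mtu is a
hypothetical helper name:

```python
# Assumed header sizes in octets; neither is stated in the message itself.
OUTER_IPV4_HEADER = 20  # outer IPv4 header without options
GRE_HEADER = 4          # base GRE header, no key/checksum/sequence fields


def gre_tunnel_mtu(underlay_mtu: int) -> int:
    """Largest inner packet that fits in one unfragmented outer packet."""
    return underlay_mtu - OUTER_IPV4_HEADER - GRE_HEADER


if __name__ == "__main__":
    print(gre_tunnel_mtu(1500))
```

With a 1500 octet underlay this gives 1476, matching the tunnel MTU
configured here.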
From Client, I connect to port 80 on Server (which is running Apache)
using IPv4 over the GRE tunnel and request a large object. tcpdump
shows a TCP packet larger than the GRE tunnel's MTU being sent over GRE,
and with the GRE header this exceeds the virtual NIC's MTU too. The KVM
host drops the oversized GRE packet and replies with an ICMP "need to
frag". The TCP packet is resized and retransmitted, this gets through,
and everything works:
ethertype IPv4, length 1530: Server_ipv4 > Router_ipv4: GREv0, proto
IPv4, length 1496: Server_ipv4.http > Client_ipv4.44247: Flags [.], seq
1:1441, ack 51, win 114, options [nop,nop,TS val 607339847 ecr
30576811], length 1440
ethertype IPv4, length 590: KVM_ipv4 > Server_ipv4: ICMP Router_ipv4
unreachable - need to frag (mtu 1500), length 556
ethertype IPv4, length 1506: Server_ipv4 > Router_ipv4: GREv0, proto
IPv4, length 1472: Server_ipv4.http > Client_ipv4.44247: Flags [.], seq
1:1417, ack 51, win 114, options [nop,nop,TS val 607339867 ecr
30577336], length 1416
But doing the same test using IPv6 over the GRE tunnel fails. tcpdump
shows an oversized TCP packet again, and again that gets passed on to
the KVM host as an oversized GRE packet, which gets dropped and an ICMP
"need to frag" returned. However, the TCP packet is never resized and
retransmitted, so the TCP session hangs:
ethertype IPv4, length 1530: Server_ipv4 > Router_ipv4: GREv0, proto
IPv6, length 1496: Server_ipv6.http > Client_ipv6.35711: Flags [.], seq
1:1421, ack 51, win 112, options [nop,nop,TS val 607991929 ecr
31228911], length 1420
ethertype IPv4, length 590: KVM_ipv4 > Server_ipv4: ICMP Router_ipv4
unreachable - need to frag (mtu 1500), length 556
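The lengths in both traces can be cross-checked with a short sketch. It
assumes a 14 octet Ethernet header, a 20 octet outer IPv4 header, a
4 octet GRE header, a 32 octet TCP header (20 octets plus the
nop,nop,timestamp options shown in the traces), and that the length
tcpdump prints after "GREv0" includes the GRE header. The function name
encapsulated_sizes is hypothetical:

```python
# All header sizes below are assumptions used to interpret the traces.
ETHERNET_HEADER = 14
OUTER_IPV4_HEADER = 20
GRE_HEADER = 4
TCP_HEADER = 20 + 12  # base TCP header plus nop,nop,timestamp options
INNER_IP_HEADER = {"ipv4": 20, "ipv6": 40}


def encapsulated_sizes(family: str, tcp_payload: int) -> dict:
    """Return the size of each layer for a TCP segment sent over GRE."""
    inner = INNER_IP_HEADER[family] + TCP_HEADER + tcp_payload
    gre = GRE_HEADER + inner          # length shown after "GREv0"
    outer = OUTER_IPV4_HEADER + gre   # what the virtual NIC must carry
    frame = ETHERNET_HEADER + outer   # length shown after "ethertype"
    return {"inner": inner, "gre": gre, "outer": outer, "frame": frame}


if __name__ == "__main__":
    for family, payload in (("ipv4", 1440), ("ipv4", 1416), ("ipv6", 1420)):
        print(family, payload, encapsulated_sizes(family, payload))
```

Under those assumptions the first IPv4 segment and the IPv6 segment both
produce a 1492 octet inner packet (over the 1476 tunnel MTU) and a 1516
octet outer packet (over the 1500 NIC MTU), while the retransmitted IPv4
segment produces a 1468 octet inner packet and a 1492 octet outer packet,
both of which fit.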
So, it seems to me that the TCP packets are initially sized according to
the virtual NIC's MTU, since that is where the default routing table
says they will go. After being generated, the packets are then sent to
the GRE tunnel instead, which has a lower MTU. My expectation is that:
1. An IPv4 packet that exceeds the GRE tunnel's MTU should be dropped by
the GRE tunnel itself and an ICMP "need to frag" should be sent back to
the TCP stack, which should retransmit a smaller packet.
2. The same should be true for IPv6 - an IPv6 packet that exceeds the
GRE tunnel's MTU should be dropped by the GRE tunnel itself and an
ICMPv6 "packet too big" should be sent up to the TCP stack, which should
retransmit a smaller packet.
3. If a GRE packet causes an upstream router to return a "need to frag",
the GRE tunnel's MTU should be reduced accordingly and (1) or (2) should
happen.
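The three expectations above can be written down as a toy model. This is
only a sketch of the behaviour described in the list, not kernel code:
the GreTunnel class and its method names are hypothetical, the 24 octet
overhead assumes a 20 octet outer IPv4 header plus a 4 octet GRE header,
and inner IPv4 packets are assumed to have DF set.

```python
from dataclasses import dataclass

OUTER_OVERHEAD = 20 + 4  # assumed: outer IPv4 header + base GRE header


@dataclass
class GreTunnel:
    """Hypothetical model of the expected tunnel behaviour."""

    mtu: int = 1476

    def send(self, family: str, inner_len: int) -> str:
        """Expectations (1) and (2): reject inner packets over the MTU."""
        if inner_len <= self.mtu:
            return "forward"
        if family == "ipv4":
            return f"drop, ICMP need-to-frag (mtu {self.mtu})"
        return f"drop, ICMPv6 packet-too-big (mtu {self.mtu})"

    def on_underlay_need_to_frag(self, path_mtu: int) -> None:
        """Expectation (3): shrink the tunnel MTU to fit the reported path."""
        self.mtu = min(self.mtu, path_mtu - OUTER_OVERHEAD)


if __name__ == "__main__":
    tunnel = GreTunnel()
    print(tunnel.send("ipv4", 1492))
    print(tunnel.send("ipv6", 1492))
    print(tunnel.send("ipv4", 1468))
```

In this model the 1492 octet inner packets from both traces would be
rejected at the tunnel, before any oversized GRE packet reached the KVM
host.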
As far as I can see, (1) and (2) aren't happening - oversized GRE
packets containing oversized IP packets are ending up at the KVM host.
(3) only seems to be working for IPv4 - the IPv6 stack never retransmits
a resized TCP packet.
Is this a bug, or am I missing something about how it should work?
Many thanks.
--
- Steve Hill
Technical Director
Opendium Limited http://www.opendium.com
Direct contacts:
Instant messager: xmpp:steve@...ndium.com
Email: steve@...ndium.com
Phone: sip:steve@...ndium.com
Sales / enquiries contacts:
Email: sales@...ndium.com
Phone: +44-1792-824568 / sip:sales@...ndium.com
Support contacts:
Email: support@...ndium.com
Phone: +44-1792-825748 / sip:support@...ndium.com