Message-ID: <CANXY5yJeCeC_FaQHx0GPn88sQCog59k2vmu8o-h6yRrikSQ3vQ@mail.gmail.com>
Date: Mon, 3 Aug 2020 21:39:34 +0300
From: mastertheknife <mastertheknife@...il.com>
To: David Ahern <dsahern@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: PMTUD broken inside network namespace with multipath routing
Hi David,
I found something that can shed some light on the issue.
The issue only happens if the ICMP response doesn't come from the first nexthop.
In my case, both nexthops are Linux routers, and they are the ones
generating the ICMP errors (because IPsec comes next). This is what I
meant earlier: the ICMP response comes back along the same path the
original message took.
Test IP #1 - 192.168.249.116 - Hash will choose nexthop #1
Test IP #2 - 192.168.249.117 - Hash will choose nexthop #2
Test with 252.250 as nexthop #1:
--------------------------------
root@...test:[~] # ip route add 192.168.249.0/24 dev eth1 nexthop via 192.168.252.250 dev eth1 nexthop via 192.168.252.252 dev eth1
root@...test:[~] # ping -M do -s 1450 192.168.249.116
PING 192.168.249.116 (192.168.249.116) 1450(1478) bytes of data.
From 192.168.252.250 icmp_seq=1 Frag needed and DF set (mtu = 1446)
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
^C
--- 192.168.249.116 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3067ms
root@...tlxc:[~] # ping -M do -s 1450 192.168.249.117
PING 192.168.249.117 (192.168.249.117) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=2 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=3 Frag needed and DF set (mtu = 1446)
^C
--- 192.168.249.117 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2052ms
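As a side note on the sizes in the output above: "1450(1478)" is the ICMP payload plus headers, and the reported MTU of 1446 is smaller than that. A minimal sketch of the arithmetic, assuming a 20-byte IPv4 header with no options and an 8-byte ICMP echo header (the function names are mine, for illustration only):

```python
# Sizes are in bytes. Assumption: plain IPv4 with no IP options.
IPV4_HEADER = 20
ICMP_ECHO_HEADER = 8


def packet_size(payload: int) -> int:
    """Total IP packet size for a ping with `-s payload`."""
    return payload + ICMP_ECHO_HEADER + IPV4_HEADER


def max_payload(mtu: int) -> int:
    """Largest `-s` value that still fits in one packet of size `mtu`."""
    return mtu - IPV4_HEADER - ICMP_ECHO_HEADER


print(packet_size(1450))  # 1478, matches "1450(1478)" in the ping banner
print(max_payload(1446))  # 1418, largest payload that fits mtu = 1446
```

So a ping with -s 1450 and DF set can never get through a 1446-byte path; the interesting part of the test is only whether the sender remembers the reported MTU.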
Test with 252.252 as nexthop #1:
--------------------------------
root@...tlxc:[~] # ip route add 192.168.249.0/24 dev eth1 nexthop via 192.168.252.252 dev eth1 nexthop via 192.168.252.250 dev eth1
root@...tlxc:[~] # ping -M do -s 1450 192.168.249.116
PING 192.168.249.116 (192.168.249.116) 1450(1478) bytes of data.
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1446)
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
^C
--- 192.168.249.116 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2044ms
root@...tlxc:[~] # ping -M do -s 1450 192.168.249.117
PING 192.168.249.117 (192.168.249.117) 1450(1478) bytes of data.
From 192.168.252.250 icmp_seq=1 Frag needed and DF set (mtu = 1446)
From 192.168.252.250 icmp_seq=2 Frag needed and DF set (mtu = 1446)
From 192.168.252.250 icmp_seq=3 Frag needed and DF set (mtu = 1446)
^C
--- 192.168.249.117 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2046ms
In summary: it doesn't seem to matter which router is the first nexthop.
If the ICMP response doesn't come from the first nexthop, it is rejected.
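The way I read the transcripts above: when the PMTU is learned, only the first error comes from the router and the following ones are "ping: local error" from the sender itself; when it is not learned, every error keeps coming from the router. A rough sketch of that check, assuming the iputils ping output format shown above (the classify helper and the sample variable names are hypothetical, not part of any tool):

```python
import re

# Patterns follow the ping output quoted in this message. The optional ">"
# tolerates mail archives that escape a leading "From".
FRAG_NEEDED = re.compile(
    r"^>?From (\S+) icmp_seq=\d+ Frag needed and DF set \(mtu = (\d+)\)"
)
LOCAL_ERROR = re.compile(r"^ping: local error: Message too long, mtu=(\d+)")


def classify(transcript: str) -> dict:
    """Summarize one `ping -M do` transcript."""
    reporters = []      # routers that sent "Frag needed", in order seen
    remote_errors = 0   # errors received from a router
    local_errors = 0    # errors raised locally from the cached PMTU
    for raw in transcript.splitlines():
        line = raw.strip()
        match = FRAG_NEEDED.match(line)
        if match:
            remote_errors += 1
            if match.group(1) not in reporters:
                reporters.append(match.group(1))
        elif LOCAL_ERROR.match(line):
            local_errors += 1
    return {
        "reporters": reporters,
        "remote_errors": remote_errors,
        "local_errors": local_errors,
        # A local error means the sender applied the reported MTU itself.
        "pmtu_learned": local_errors > 0,
    }


# The two transcripts from the first test (252.250 as nexthop #1).
via_first_nexthop = """\
From 192.168.252.250 icmp_seq=1 Frag needed and DF set (mtu = 1446)
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
ping: local error: Message too long, mtu=1446
"""
via_second_nexthop = """\
From 192.168.252.252 icmp_seq=1 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=2 Frag needed and DF set (mtu = 1446)
From 192.168.252.252 icmp_seq=3 Frag needed and DF set (mtu = 1446)
"""

print(classify(via_first_nexthop))
print(classify(via_second_nexthop))
```

For these two samples, pmtu_learned comes out true for the destination hashed to nexthop #1 and false for the one hashed to nexthop #2, and the same holds with the nexthop order swapped.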
As for why I couldn't reproduce this outside LXC, I don't know yet, but
I will keep trying to figure it out.
Let me know if you need me to test this.
Thank you,
Kfir Itzhak
On Mon, Aug 3, 2020 at 6:38 PM David Ahern <dsahern@...il.com> wrote:
>
> On 8/3/20 8:24 AM, mastertheknife wrote:
> > Hi David,
> >
> > In this case, both paths are in the same layer2 network, there is no
> > symmetric multi-path routing.
> > If original message takes path 1, ICMP response will come from path 1
> > If original message takes path 2, ICMP response will come from path 2
> > Also, It works fine outside of LXC.
> >
> >
>
> I'll take a look when I get some time; most likely end of the week.