netdev - PROBLEM: MTU of ipsec tunnel drops continuously until traffic stops

Message-ID: <1467604370799.64629@alliedtelesis.co.nz>
Date:	Mon, 4 Jul 2016 03:52:50 +0000
From:	Matt Bennett <Matt.Bennett@...iedtelesis.co.nz>
To:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:	Steffen Klassert <steffen.klassert@...unet.com>,
	Herbert Xu <herbert@...dor.apana.org.au>
Subject: PROBLEM: MTU of ipsec tunnel drops continuously until traffic stops

*Resending as plain text so the mailing list accepts it. Sorry Steffen and Herbert*

Hi,

During long-running testing of an ipsec tunnel over a PPP link, we found that traffic would occasionally stop flowing over the tunnel. The traffic eventually starts again on its own, but running "ip route flush cache" makes it start flowing again immediately.

Note: I am using a 4.4.6-based kernel, but I see no major differences between 4.4.6 and 4.4.14 (current LTS) in any of the code I am debugging. I have manually debugged the code as far as I can, but I don't know it well enough to make further progress. What I have uncovered is outlined below:

Pinging the other end of the tunnel while the traffic is stopped gives messages like the following:

10-AR4050#ping 172.16.0.5
PING 172.16.0.5 (172.16.0.5) 56(84) bytes of data.
From 172.16.0.6 icmp_seq=1 Frag needed and DF set (mtu = 46)
From 172.16.0.6 icmp_seq=2 Frag needed and DF set (mtu = 46)

This is odd given the MTUs configured on the interfaces (note the mtu values):

[root@...AR4050 /flash]# ip link
16778240: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1492 qdisc htb state UP mode DEFAULT group default qlen 3
    link/ppp 
14: tunnel64@...E: <POINTOPOINT,MULTICAST,UP,LOWER_UP> mtu 1200 qdisc htb state UNKNOWN mode DEFAULT group default qlen 1
    link/ipip 203.0.113.10 peer 203.0.113.5

The code that generates the ICMP_FRAG_NEEDED packet is vti_xmit() (ip_vti.c), which checks the skb length against the mtu of the dst entry. Since the mtu is lower than the packet length (debug shows the mtu is 46, as expected from the ping output), the ICMP error is generated.

Digging further, I find that when the issue occurs the mtu value is being updated in what appears to be an error case in xfrm_bundle_ok() (xfrm_policy.c). Specifically, the block of code:

if (likely(!last))
        return 1;

is not hit, meaning there is a difference between the cached mtu value and the value just calculated. I then see this code being hit continuously, and each time the mtu is lowered further, i.e. (the first drop is 82 bytes and the following ones are 80 bytes; I don't know if that is significant)

1200
1118
1038
958
878
 ....
46

At this point I don't know how to debug further. It appears there is a disparity between cached mtu values and what is calculated in xfrm_bundle_ok(). Hopefully this is enough information to explain the problem I am seeing. Please let me know if you need more information.

Thanks,
Matt