Message-ID: <20130315112516.4b1651ca@vostro>
Date:	Fri, 15 Mar 2013 11:25:16 +0200
From:	Timo Teras <timo.teras@....fi>
To:	netdev@...r.kernel.org
Subject: Re: linux-3.6+, gre+ipsec+forwarding = IP fragmentation broken

On Wed, 13 Mar 2013 17:14:53 +0200
Timo Teras <timo.teras@....fi> wrote:

> In the typical DMVPN setup with an IPv4-ESP-GRE-IPv4 stack, it seems
> that IPv4 fragmentation broke around 3.6 for forwarded packets.
> 
> It would seem that fragmentation works for locally generated packets.
> Also PMTU (DF set) seems to work for both forwarded and locally
> generated packets. But packets forwarded to a gre device and then
> IPsec encrypted do not get fragmented properly.
> 
> 3.4.x kernels work, 3.6 and 3.8 series tested and fail similarly.

Actually 3.4.x vanilla does not work. It works only with 38d523e "ipv4:
Remove output route check in ipv4_mtu" applied which I've been
cherry-picking to my builds.

> I was going through the changelog and it seems that MTU is now handled
> in nexthop exceptions and one needs to produce the full flow info to
> update it. I'm wondering if this does not hold true in my code path, as
> ip_gre rewraps the forwarded packet and creates a new IP header - when
> it next goes to the xfrm code (which sends the ICMP error) the inner
> iphdr is no longer accessible. Would this cause the breakage that I'm
> seeing? Or is the forward flow's mtu still updated somehow?

I now have a theory on what goes wrong.

My gre tunnel is configured with 'ttl 64', so the tunnel IP header
always gets the DF bit set to do proper path-mtu discovery. The locally
generated ICMP messages I see imply that re-fragmentation happens only
at the level of the tunnel's IPv4 header - but by then it is too late:
the large packet is queued and IPsec'ed, and it is the IPsec'ed packet
that the stack tries to fragment (but it has DF set, so that fails and
the packet is dropped).
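
To make the size arithmetic concrete, here is a rough Python model of
that path. It is only a sketch of the theory, not kernel code: the
function name is made up, and ESP_OVERHEAD is an assumed figure (the
real value depends on the cipher, mode and padding).

```python
# Toy model of the forwarding path described above, not kernel code.
GRE_OVERHEAD = 24   # 20-byte outer IPv4 header + 4-byte basic GRE header
ESP_OVERHEAD = 56   # ASSUMPTION: illustrative only, varies with cipher/mode


def forward_outcome(inner_len, path_mtu, outer_df):
    """Return what happens to a forwarded inner packet of inner_len bytes.

    outer_df models the DF bit on the tunnel's IPv4 header (set when the
    tunnel has a fixed ttl such as 'ttl 64').
    """
    gre_len = inner_len + GRE_OVERHEAD   # after GRE encapsulation
    esp_len = gre_len + ESP_OVERHEAD     # after IPsec encapsulation
    if esp_len <= path_mtu:
        return "sent"
    # The packet is only found to be too big after encapsulation.
    if outer_df:
        return "dropped"                 # DF set: cannot be fragmented
    return "fragmented"                  # fragmented at the outer header


if __name__ == "__main__":
    for df in (True, False):
        print(df, forward_outcome(1500, 1500, df))
```

With these assumed overheads, a full-size 1500-byte forwarded packet on
a 1500-byte path is dropped when the outer header carries DF, and is
only fragmented at the outer level when it does not.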

I believe ip_gre should explicitly fragment the inner IPv4 and IPv6
packets if the tunnel's ttl is not inherited (resulting in DF bit set
in the tunnel's IPv4 header).
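
As a sketch of what "explicitly fragment the inner packet" would mean
for IPv4, the following Python computes the fragment sizes for an inner
packet before encapsulation. The function name and the tunnel_mtu
parameter are hypothetical; this is not how ip_gre is implemented, and
it assumes the inner packet does not have DF set. IPv6 is not covered.

```python
def inner_fragment_sizes(total_len, tunnel_mtu, ihl=20):
    """Return the total lengths of the IPv4 fragments of an inner packet.

    total_len:  length of the inner IPv4 packet, header included
    tunnel_mtu: largest inner packet the tunnel can carry (hypothetical
                parameter: path MTU minus GRE and IPsec overhead)
    ihl:        inner IPv4 header length in bytes
    """
    if total_len <= tunnel_mtu:
        return [total_len]               # fits, no fragmentation needed
    # Every fragment except the last carries a multiple of 8 payload bytes.
    max_payload = (tunnel_mtu - ihl) // 8 * 8
    if max_payload <= 0:
        raise ValueError("tunnel_mtu too small to carry any payload")
    payload = total_len - ihl
    sizes = []
    while payload > 0:
        chunk = min(payload, max_payload)
        sizes.append(ihl + chunk)        # each fragment gets its own header
        payload -= chunk
    return sizes


if __name__ == "__main__":
    print(inner_fragment_sizes(1500, 1420))
```

Each resulting fragment then fits through the tunnel after
encapsulation, so the DF bit on the outer header never has to be
violated.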

So basically ip_gre has been wrong all along - things just happened to
work because GRO/GSO was not implemented in ip_gre, and because of the
way the (now deleted) routing cache exposed pmtu.

Does this make sense?

- Timo