netdev - Re: [PATCH -next 3/3] ipv4: don't remove df bit when refragmenting

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150412085122.GA14720@breakpoint.cc>
Date:	Sun, 12 Apr 2015 10:51:22 +0200
From:	Florian Westphal <fw@...len.de>
To:	Florian Westphal <fw@...len.de>
Cc:	netdev@...r.kernel.org, kaber@...sh.net, jesse@...ira.com
Subject: Re: [PATCH -next 3/3] ipv4: don't remove df bit when refragmenting

Florian Westphal <fw@...len.de> wrote:
> We always send fragments without DF bit set.
> 
> Thus, given following setup:
> 
> mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
>    A           R1               R2         B
> 
> Where R1 and R2 run linux with netfilter defragmentation/conntrack
> enabled, then if Host A sent a fragmented packet _with_ DF set to B, R1
> will respond with icmp too big error if one of these fragments exceeded
> 1400 bytes.  So far, so good.
> 
> However, the host A will never learn about the lower 1280 link.
> The next packet presumably sent by A would use 1400 as the new fragment
> size, but R1 will strip DF bit when refragmenting.
> 
> Whats worse: If R1 receives fragment sizes 1200 and 100, it would
> forward the reassembled packet without refragmenting, i.e.
> R2 would send an icmp error in response to a packet that was never sent,
> citing mtu that the original sender never exceeded.
> 
> In order to 'replay' the original fragments to preserve their semantics,
> one solution is to
> 
>  1. set DF bit on the new fragments if it was set on original ones.
>  2. set the size of the new fragments generated by R1 during
>     refragmentation to the largest size seen when defragmenting.
> 
> R2 will then notice the problem and will send the expected
> 'too big, use 1280' icmp error, and further fragments of this size
> would not grow anymore to 1400 link mtu when R1 refragments.
> 
> There is however, one important caveat. We cannot just use existing
> IPCB(skb)->frag_max_size as upper boundary for refragmentation.
> 
> We have to consider a case where we receive a large fragment without DF,
> followed by a small fragment with DF set.
> 
> In such scenario we must not generate a large spew of small DF-fragments
> (else we induce packet/traffic amplification).
> 
> This modifies ip_fragment so that we track largest fragment size seen
> both for DF and non-DF packets.
> 
> Then, when we find that we had at least one DF fragment AND the largest
> non-DF fragment did not exceed one with DF set, let ip_fragment know that
> it should refragment using original frag size and also set DF bit on the
> newly created fragments.

Seems Jesse would prefer if we'd set max_frag_size unconditionally.

The advantage of doing that would be that we can easily get rid of the
nf_bridge_mtu_reduction() kludge we have right now since we'd just pick
the largest original fragment size.

Only caveat is that we'd have to check IPSKB_FRAG_PMTU flag too before
deciding to reject if out mtu is too small, which is easy to do.

So, before I respin patch #3: What do others think?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html