netdev - IP fragmentation performance and don't fragment bug when forwarding

Message-ID: <462e25db-8aad-7687-31e5-fb812d8daeaa@gmail.com>
Date:   Sun, 2 Dec 2018 14:01:08 +0200
From:   Risto Pajula <or.pajula@...il.com>
To:     "David S. Miller" <davem@...emloft.net>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>
Cc:     netdev@...r.kernel.org
Subject: IP fragmentation performance and don't fragment bug when forwarding

Hello.

I have run into a strange performance problem with Linux IP fragmentation
when using video streaming services behind NAT. I have also looked into
a possible bug in the handling of the DF (don't fragment) bit when
forwarding IP packets.

First the system setup description:

[host1]-int lan-(eth1)[linux router](eth0)-extlan-[fibre router]-internet

where:
host1: a Netgem N7800 "cable box" for online video streaming services
provided by the local telco (can access Netflix, HBO Nordic, "live TV", etc.)
linux router: Linux computer with a dual-core Intel Celeron G1840,
currently running Linux kernel 4.20.0-rc2 and openSUSE Leap 15.0
eth1: the Linux router's internal (NAT) interface, 192.168.0.1/24 network,
MTU set to 1500, RTL8169sb/8110sb
eth0: the Linux router's internet-facing interface, public IP address,
MTU set to 1500, RTL8168evl/8111evl
fibre router: Alcatel Lucent fibre router (I-241G-Q), directly connected
to eth0 of the Linux router.

When the Netgem N7800 is used with online video services (Netflix,
HBO Nordic, etc.), the Linux router receives very large IP packets on
eth0, up to ~20 kB. This seems to lead to the following problems in the
Linux IP stack.

IP fragmentation performance:
When the Linux router receives these large IP packets on eth0,
everything works, but they seem to cause a very large latency
degradation for traffic from the internal network to the internet while
the IP fragmentation is being performed. The ping latency from the
internal network to the internet increases from a stable 15-20 ms to
700-800 ms, and so does the ping from the internal network to the Linux
router's eth1 (192.168.0.). The uplink, however, works perfectly: ping
from the Linux router to the internet stays stable while streaming.
It seems that the IP fragmentation is somehow blocking eth1 reception
or transmission for a very long time (which it shouldn't). I am able to
test and debug the issue further, but advice on where to look would be
appreciated.

DF Bit, mtu bug when forwarding:
While studying the problem above I found a possible bug in the DF bit
and MTU handling in IP forwarding. The big packets received from the
streaming services all have the DF bit set, so the question is whether
we should be forwarding them at all, since forwarding results in them
being fragmented. Apparently we currently are... I have traced this
down to the function ip_exceeds_mtu() in ip_forward.c, and the
following patch seems to fix it.

--- net/ipv4/ip_forward.c.orig  2018-12-02 11:09:32.764320780 +0200
+++ net/ipv4/ip_forward.c       2018-12-02 12:53:25.031232347 +0200
@@ -49,7 +49,7 @@ static bool ip_exceeds_mtu(const struct
                 return false;

         /* original fragment exceeds mtu and DF is set */
-       if (unlikely(IPCB(skb)->frag_max_size > mtu))
+        if (unlikely(skb->len > mtu))
                 return true;

         if (skb->ignore_df)

This seems to work (in some ways): after the change, IP packets that are
too large for the internal network get dropped and we send "ICMP
Destination unreachable, The datagram is too big" messages to the
originator (as we should?). However, it seems that not all services
really like this... Netflix behaves as expected and ping from the
internal network to the internet is stable, but HBO Nordic, for example,
no longer works (too little buffering? Retransmissions not working?).
So it seems the original issue should also be fixed (and the
fragmentation should be allowed?).

Any advice would be appreciated. Thanks!

PS. Watching TV was not this intensive 20 years ago :)

-- 
Risto Pajula
or.pajula@...il.com