Message-ID: <b024bedb-d9e8-ee04-2443-2804760f51e4@mattcorallo.com>
Date: Mon, 29 Mar 2021 20:04:04 -0400
From: Matt Corallo <linux-net@...tcorallo.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: IP_FRAG_TIME Default Too Large
IP_FRAG_TIME defaults to a full 30 seconds to wait for reassembly of fragments. In practice, with the default values,
if I send enough fragments over a link that there is material loss, it's not unusual to see fragments be completely
dropped for the remainder of a 30-second window before things return to normal.
This issue largely goes away when setting net.ipv4.ipfrag_time to 0/1. Is there a reason IP_FRAG_TIME defaults to
something so high? If it's been 30 seconds, the packet you receive next is almost certainly not the one you wanted.
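For reference, here is how I've been adjusting the knob at runtime (the sysctl name is the standard one; the 1-second value is just what I tested with, not a recommendation):

```shell
# Current reassembly timeout, in seconds (default 30)
sysctl net.ipv4.ipfrag_time

# Lower it for this boot (requires root)
sysctl -w net.ipv4.ipfrag_time=1

# Persist across reboots, if desired
echo 'net.ipv4.ipfrag_time = 1' > /etc/sysctl.d/90-ipfrag.conf
```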
That said, if I'm reading ip_fragment.c right (and I'm almost certainly not), the behavior seems different from what's
documented: q.fqdir->timeout is only used in ip_frag_reinit, which is only called when ip_frag_too_far hits, indicating
the packet is out of the net.ipv4.ipfrag_max_dist bound.
Reading the docs, I expected something more like "if the packet is out of the net.ipv4.ipfrag_max_dist bound, drop the
queue; also, if the packet is older than net.ipv4.ipfrag_time, drop the packet", not "if the packet is out of the
net.ipv4.ipfrag_max_dist bound *and* the packet is older than net.ipv4.ipfrag_time, drop the queue". If I'm reading it
right, this doesn't seem like what you generally want to happen - e.g. in my case, if you get some loss on a flow that
contains fragments, it's very easy to end up with all fragments lost until you meet the above criteria and drop the
queue after 30 seconds, instead of making a best effort to reassemble new packets as they come in, dropping old ones.
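To make the difference concrete, here is a deliberately crude toy model (not kernel code; every name in it - HIGH_THRESH, run, etc. - is made up). Incomplete reassembly queues hold buffer space until evicted, standing in for ipfrag_high_thresh memory pressure; with age-based eviction, stale queues free space after the timeout, while without it they pin the buffer and starve all later fragments:

```python
import random

HIGH_THRESH = 8    # toy stand-in for ipfrag_high_thresh: max fragments buffered
FRAG_TIMEOUT = 30  # toy stand-in for net.ipv4.ipfrag_time, in simulated seconds

def run(evict_by_age):
    """Simulate 200 seconds of traffic, one 2-fragment packet per second,
    with 10% independent per-fragment loss. Returns packets reassembled."""
    random.seed(1)
    queues = {}    # packet id -> set of fragment indices received so far
    completed = 0
    for t in range(200):
        pkt = t    # packet id doubles as its arrival time
        if evict_by_age:
            # age-based policy: drop any queue older than the timeout
            queues = {p: q for p, q in queues.items()
                      if t - p < FRAG_TIMEOUT}
        for frag in (0, 1):
            if random.random() < 0.1:
                continue  # this fragment was lost on the wire
            buffered = sum(len(q) for q in queues.values())
            if pkt not in queues and buffered >= HIGH_THRESH:
                continue  # buffer full of stale queues: fragment dropped
            queues.setdefault(pkt, set()).add(frag)
        if pkt in queues and len(queues[pkt]) == 2:
            completed += 1
            del queues[pkt]
    return completed

print("with age-based eviction:   ", run(True))
print("without age-based eviction:", run(False))
```

In this toy, without age-based eviction the half-complete queues left behind by lost fragments accumulate until the buffer fills, after which no new packet ever reassembles - which is roughly the failure mode I'm describing above, just compressed in time.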
Thanks,
Matt
(Note: not subscribed, please keep me on CC when responding)