Message-ID: <0cb19f7e-a9b3-58f8-6119-0736010f1326@bluematt.me>
Date:   Wed, 28 Apr 2021 10:09:00 -0400
From:   Matt Corallo <netdev-list@...tcorallo.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Willy Tarreau <w@....eu>, Keyu Man <kman001@....edu>
Subject: Re: [PATCH net-next] Reduce IP_FRAG_TIME fragment-reassembly timeout
 to 1s, from 30s



On 4/28/21 08:20, Eric Dumazet wrote:
> This is going to break many use cases.
> 
> I can certainly say that in many cases, we need more than 1 second to
> complete reassembly.
> Some Internet users share satellite links with 600 ms RTT, not
> everybody has fiber links in 2021.

I'm curious what RTT has to do with it? Frags aren't resent, so there's no RTT you need to wait for; the question is 
more your available bandwidth and how much packet reordering you see, which even for many sat links isn't zero anymore 
(better, in-flow packet reordering is becoming more and more rare!).

Even given some material reordering, 30 seconds on a 100Kb link is a lot!
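
To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes "100Kb" means a 100 kbit/s link and ignores per-fragment header overhead and loss; the function and constant names are illustrative only.

```python
def transfer_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Time to serialize payload_bytes onto the link (no header overhead, no loss)."""
    return payload_bytes * 8 / link_bits_per_second


# Largest datagram the 16-bit IPv4 total-length field allows.
MAX_IPV4_DATAGRAM_BYTES = 65_535
# Assumption: "100Kb" read as 100 kbit/s.
LINK_BPS = 100_000

# How long every fragment of a maximum-size datagram takes to cross the link.
seconds_for_max_datagram = transfer_seconds(MAX_IPV4_DATAGRAM_BYTES, LINK_BPS)
# How much data the same link can carry during a 30 s reassembly window.
bytes_in_30s = 30 * LINK_BPS / 8

print(f"max-size datagram: {seconds_for_max_datagram:.2f} s")
print(f"data carried in 30 s: {bytes_in_30s:.0f} bytes")
```

Under those assumptions, even a maximum-size datagram finishes arriving in a little over 5 seconds, well inside the 30 second window.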

> There is a sysctl, exactly for the cases where admins can decide to
> make the value smaller.

Sadly this doesn't actually solve it in many cases. Because Linux reassembles fragments by default any time conntrack is 
loaded (disabling this is very nontrivial), anyone with a Linux box in between two hosts ends up breaking flows with any 
material loss of frags.

More broadly, just because there is a sysctl doesn't mean the default shouldn't be sensible for most users. As you note, 
there's a sysctl; if someone is on a 1Kbps sat link with fragments sent out of order, they can change it :). This 
constant hasn't been touched since pre-git!
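
For reference, a minimal sketch of inspecting or changing that timeout at runtime. It assumes the sysctl under discussion is net.ipv4.ipfrag_time (exposed as /proc/sys/net/ipv4/ipfrag_time, in seconds); the helper names are hypothetical, and writing requires root.

```python
from pathlib import Path

# Assumption: the sysctl being discussed is net.ipv4.ipfrag_time (seconds).
IPFRAG_TIME_PATH = Path("/proc/sys/net/ipv4/ipfrag_time")


def read_sysctl_int(path: Path) -> int:
    """Return the integer value stored in a single-value sysctl file."""
    return int(path.read_text().strip())


def write_sysctl_int(path: Path, value: int) -> None:
    """Write an integer value to a sysctl file (needs root for real /proc paths)."""
    path.write_text(f"{value}\n")
```

On a Linux host, `read_sysctl_int(IPFRAG_TIME_PATH)` reports the current reassembly timeout, and `write_sysctl_int(IPFRAG_TIME_PATH, 1)` would lower it to one second until the next reboot.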

> You can laugh all you want, the sad thing with IP frags is that really
> some applications still want to use them.

Yes, including my application, which breaks any time the flow *transits* a Linux box (i.e. not just my end host(s), but 
any box in between that happens to have conntrack loaded).

> Also, admins willing to use 400 MB of memory instead of 4MB can just
> change a sysctl.
> 
> Again, nothing will prevent reassembly units to be DDOS targets.

Yep, not claiming any differently. As noted in a previous thread, you really have to crank up the limits to prevent DDOS.

> At Google, we use 100 MB for /proc/sys/net/ipv4/ipfrag_high_thresh and
> /proc/sys/net/ipv6/ip6frag_high_thresh,
> no kernel patch is needed.
> 
