Message-ID: <64829c98-e4eb-6725-0fee-dc3c6681506f@bluematt.me>
Date: Wed, 28 Apr 2021 12:35:28 -0400
From: Matt Corallo <netdev-list@...tcorallo.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Willy Tarreau <w@....eu>, "David S. Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Keyu Man <kman001@....edu>
Subject: Re: [PATCH net-next] Reduce IP_FRAG_TIME fragment-reassembly timeout
to 1s, from 30s
On 4/28/21 11:38, Eric Dumazet wrote:
> On Wed, Apr 28, 2021 at 4:28 PM Matt Corallo
> <netdev-list@...tcorallo.com> wrote:
> I have been working in wifi environments (linux conferences) where RTT
> could reach 20 sec, and even 30 seconds, and this was in some very
> rich cities in the USA.
>
> Obviously, when a network is under provisioned by 50x factor, you
> _need_ more time to complete fragments.
It's also a trade-off. If you're in a hugely under-provisioned environment with bufferbloat issues, you may have some
fragments that need more time for reassembly because they've gotten horribly reordered (though a 20-second RTT alone
doesn't imply that fragments are going to be reordered by 20 seconds; more likely you'd see a small fraction of
that). But you're also likely to have more *lost* fragments, which can trigger the black-holing behavior here.
If you have some loss in the flow, it's very easy to hit 1Mbps of lost fragments, and suddenly, instead of giving
fragments more time to reassemble, you're black-holing them. I'm not claiming I have the right trade-off here, and I'd
love more input, but at least in my experience trying to occasionally send fragments on a pretty standard DOCSIS
modem, 30s is way off.
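
As a rough sketch of the arithmetic behind that 1Mbps figure: the model below assumes the reassembly memory limit is
the 4 MiB default of net.ipv4.ipfrag_high_thresh (an assumption, not stated in this thread), that every lost-fragment
byte stays queued until the timeout expires, and it ignores per-packet memory overhead, so real numbers would be
somewhat lower. The function names are made up for illustration.

```python
# Back-of-the-envelope model of fragment-queue saturation.
# ASSUMPTION: reassembly memory limit is 4 MiB (believed to be the Linux
# default for net.ipv4.ipfrag_high_thresh; check your own system).
HIGH_THRESH_BYTES = 4 * 1024 * 1024


def saturating_loss_rate_bps(timeout_s, thresh_bytes=HIGH_THRESH_BYTES):
    """Rate (bits/s) of never-completed fragment data that keeps the queue full.

    Incomplete fragments sit in memory for timeout_s, so steady-state
    occupancy is rate * timeout; solve for the rate that equals the limit.
    """
    return thresh_bytes * 8 / timeout_s


def seconds_until_full(lost_bps, thresh_bytes=HIGH_THRESH_BYTES):
    """Time for a steady stream of lost-fragment data to fill an empty queue."""
    return thresh_bytes * 8 / lost_bps


if __name__ == "__main__":
    for timeout_s in (30, 1):
        mbps = saturating_loss_rate_bps(timeout_s) / 1e6
        print(f"timeout {timeout_s:>2}s: queue stays full at ~{mbps:.2f} Mbps of lost fragments")
```

Under these assumptions a 30s timeout saturates at roughly 1.1 Mbps of lost fragment data, while a 1s timeout needs
roughly 33.6 Mbps.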
> For some reason, the crazy IP reassembly stuff comes every couple of
> years, and it is now a FAQ.
>
> The Internet has changed for the lucky ones, but some deployments are
> using 4Mbps satellite connectivity, shared by hundreds of people.
I'd think this is a great example of a case where you precisely *don't* want such a low threshold for dropping all
fragments. Note that in my specific deployment (standard DOCSIS), we're talking about the same speed and network as was
available ten years ago; this isn't exactly an uncommon or particularly fancy deployment. The real issue is applications
which happily send 8MB of fragments within a few seconds and suddenly find themselves completely black-holed for 30
seconds, but those applications aren't going to just go away.
> I urge application designers to _not_ rely on doomed frags, even in
> controlled networks.
I'd love to, but we're talking about a default value for fragment reassembly. At least in my subjective experience here,
the conservative 30s timeout takes things from "more time" to "completely black-holed", which feels like the wrong
trade-off. At the end of the day, you can't expect fragments to work super well, indeed, and you have to assume some
amount of loss; the goal is to minimize the loss you see from them.
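
For anyone wanting to check what defaults their own machine is running with, here is a minimal sketch that reads the
relevant IPv4 fragment sysctls from /proc. The helper name is made up for illustration; a value of None just means the
file was missing or unreadable on that system.

```python
from pathlib import Path


def read_frag_sysctls(base="/proc/sys/net/ipv4"):
    """Return the IPv4 fragment-reassembly sysctls as a dict of ints.

    ipfrag_time is in seconds; the thresholds are in bytes. Entries that
    cannot be read or parsed are reported as None rather than raising.
    """
    names = ("ipfrag_time", "ipfrag_high_thresh", "ipfrag_low_thresh")
    values = {}
    for name in names:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_frag_sysctls().items():
        print(f"net.ipv4.{name} = {value}")
```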
Even if you have some reordering, you're unlikely to see every fragment reordered such that you always need 30s to
reassemble (I guess you could imagine a horribly broken qdisc; does such a thing exist in practice?). Taking some loss
to avoid making it so easy to completely black-hole fragments seems like a reasonable trade-off.
Matt