Message-Id: <5C507E79-DD05-400A-B4A9-364EEEA12C08@gmail.com>
Date: Tue, 1 May 2018 21:31:58 +0300
From: Jonathan Morton <chromatix99@...il.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Dave Taht <dave.taht@...il.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Cake List <cake@...ts.bufferbloat.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [Cake] [PATCH net-next v6] Add Common Applications Kept Enhanced
(cake) qdisc
> On 1 May, 2018, at 7:06 pm, Eric Dumazet <eric.dumazet@...il.com> wrote:
>
> You have not provided any numbers to show how useful it is to maintain this
> code (probably still broken btw, considering it is changing some skb attributes).
A simple calculation shows the maximum tolerable asymmetry for a conventional TCP connection. If the MTU is 1500 bytes, a pure ack is 84 bytes, and an ack is sent for every 3 data packets received, then bandwidth asymmetry exceeding about 50:1 inevitably results in the inability to fully utilise downstream bandwidth. This ratio drops to 34:1 if acks are sent for every two data packets, and to 17:1 for the RFC-mandated "ack every packet" behaviour when loss is detected.
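For illustration, the arithmetic behind those thresholds can be sketched as below. The 1500- and 84-byte figures are the ones quoted above; with these round numbers the computed ratios land near, though not exactly at, the 50:1 / 34:1 / 17:1 figures, which presumably used slightly different per-packet overhead accounting.

```python
# Back-of-the-envelope: at what downstream:upstream ratio do acks alone
# saturate the upstream link?  Packet sizes are assumptions taken from
# the text, not a definitive wire-format accounting.
MTU = 1500   # bytes per downstream data packet
ACK = 84     # bytes per upstream pure ack

def max_asymmetry(packets_per_ack):
    """Downstream:upstream bandwidth ratio at which the upstream is
    fully occupied carrying nothing but acks."""
    return (packets_per_ack * MTU) / ACK

for n in (3, 2, 1):
    print(f"1 ack per {n} data packet(s): ~{max_asymmetry(n):.0f}:1")
```

Beyond these ratios the acks cannot keep up, so the downstream window stops growing regardless of available downstream capacity.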
Physical asymmetry ratios exceeding 20:1 are not rare. The *effective* ratio increases still further if reverse traffic is present; given a nominal 10:1 asymmetry, interleaving one ack with one MTU-sized packet (as SFQ would) gives an *effective* asymmetry for the downstream flow of 188:1, and seriously affects downstream goodput.
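The 188:1 figure follows directly from the share of the upstream link that the ack flow receives under one-for-one interleaving; a minimal sketch, assuming the same 1500/84-byte packet sizes as above:

```python
# Effective asymmetry when an SFQ-style scheduler interleaves one pure
# ack with one MTU-sized upstream data packet.  Sizes are assumptions
# carried over from the text.
MTU, ACK = 1500, 84
nominal = 10                     # nominal downstream:upstream asymmetry

# The ack flow gets one 84-byte slot per 1500-byte data packet, i.e. an
# 84/(84+1500) share of the upstream link's bandwidth.
ack_share = ACK / (ACK + MTU)

# Downstream bandwidth relative to the bandwidth actually available to acks.
effective = nominal / ack_share
print(f"effective asymmetry ~ {int(effective)}:1")   # ~188:1
```

So even a modest 10:1 link behaves, from the ack flow's point of view, like one well beyond the tolerable thresholds computed earlier.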
High asymmetry ratios can be tolerated by TCPs implementing sparser acks than 1:3 ratios, as proposed in AckCC. Without AckCC however, I understand a strict reading of the RFCs prohibits TCPs from going beyond 1:3. Even if Linux TCPs do so anyway, billions of Windows, Mac and mobile hosts most likely will not. This makes a solely end-to-end solution impractical for the time being, with no obvious hope of improvement.
To maintain downstream goodput under these conditions, it is necessary either to send bursts of acks between the upstream traffic - which Cake will do quite happily if the ack-filter is disabled, but which is slightly wasteful of bandwidth and also upsets RTT-sensitive TCPs such as BBR - or to delete some of the acks. When many upstream flows compete with a single ack flow, the latter is the only reasonable solution, unless acks are hard-prioritised (which has negative effects of its own, so we didn't consider it).
Middleboxes *are* permitted to drop packets, and AQM routinely does so. We found that AQM could delete acks beneficially, but that it ramped up too slowly to really solve the problem (acks being individually small, much larger numbers of them must be dropped once they become a saturating flow, compared to a data flow). Also, ack flows are often *antiresponsive*, in that dropping some of them causes an increase in their arrival rate due to increasing the ack-clocked downstream traffic, so it should be moderately obvious that conventional AQM strategies don't apply. We also looked at existing ack filters' behaviour and found it wanting, so we were initially reluctant to implement our own.
However, having seen many counterexamples of how *not* to write an ack filter, we were eventually able to write one that *didn't* break the rules, and ensured that the information TCP relies on remained in acks that were not deleted even when many acks were. This includes preserving the triple-repetition rule for detecting packet loss, and preserving the tail ack signifying reception of the end of a flow - which naturally results in only dropping acks if it's worthwhile to do so. In short, we're fully aware of the potential for breaking TCP this way, and we've done our level best to avoid it.
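The rules described in this paragraph can be illustrated with a deliberately simplified sketch - this is NOT the actual Cake implementation (see net/sched/sch_cake.c); the flow key, the pure-ack test, and the redundancy test are hypothetical stand-ins for what the real filter checks:

```python
# Simplified illustration of the ack-filter rules: on arrival of a new
# pure ack, at most one OLDER queued ack of the same flow may be
# dropped, and only if it carries no information TCP still needs.
from collections import namedtuple

# Simplified view of a queued pure ack (real code inspects TCP headers).
Ack = namedtuple("Ack", "flow ack_no dup sack")

def filter_on_enqueue(queue, new):
    """Enqueue `new`, possibly dropping one superseded earlier ack.
    The newly arrived ack itself is never dropped - which also means
    the tail ack of a flow always survives."""
    for i, old in enumerate(queue):
        if (old.flow == new.flow
                and old.ack_no <= new.ack_no   # superseded by `new`
                and not old.dup                # keep duplicate acks, so the
                                               # triple-dup-ack loss signal
                                               # is preserved
                and old.sack == new.sack):     # no SACK info would be lost
            del queue[i]                       # drop the redundant old ack
            break                              # at most one drop per arrival
    queue.append(new)
    return queue
```

For example, enqueueing acks for sequence 1000 and then 2000 on the same flow leaves only the 2000 ack queued, while two duplicate acks for the same sequence are both kept.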
Testing revealed an appreciable, simultaneous improvement in both downstream and upstream goodput, and in the downstream flow's internal RTT, under appropriate traffic conditions. A more aggressive variant, going to the edges of what might be allowed by RFCs, actually showed *less* improvement than the standard one - it interfered with TCP's behaviour too much. We can dig up the data if required.
> Also on wifi, the queue builds in the driver queues anyway, not in the qdisc.
> So ACK filtering, if _really_ successful, would need to be modularized.
Agree that the wifi queues would also benefit from ack filtering. In fact, with make-wifi-fast improvements active, installing Cake or any other qdisc on that interface should be unnecessary. Cake's shaper can't easily adapt to variable-rate links on the timescales required, anyway. (It could if something told it directly about the variations.)
However, I don't see a way to install the ack-filter as a separate entity in its own right, without the ability to manipulate the queue downstream of itself. The arrival of a new ack may trigger the deletion of a previous one - never the one that has just arrived - yet keeping an internal queue of acks within the filter would be counterproductive. That's why we implemented it within Cake instead of as a separate thing.
It may be possible to modularise it sufficiently that qdiscs could add support for ack-filtering without duplicating the code, rather than the other way around. This would also allow wifi queues to be modified the same way. Would that be acceptable?
- Jonathan Morton