Date:	Fri, 5 Jun 2015 22:23:21 +0200
From:	Kenneth Klette Jonassen <kennetkl@....uio.no>
To:	Yuchung Cheng <ycheng@...gle.com>
Cc:	Kenneth Klette Jonassen <kennetkl@....uio.no>,
	netdev <netdev@...r.kernel.org>,
	Eric Dumazet <edumazet@...gle.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	Neal Cardwell <ncardwell@...gle.com>,
	David Hayes <davihay@....uio.no>,
	Andreas Petlund <apetlund@...ula.no>,
	Dave Taht <dave.taht@...ferbloat.net>,
	Nicolas Kuhn <nicolas.kuhn@...ecom-bretagne.eu>
Subject: Re: [PATCH net-next 2/2] tcp: add CDG congestion control

> nice patch. I would like to review it more thoroughly but I have some
> quick comments.
>
> given that cdg didn't include hystart. it'd be nice to have a module knob to
> enable and disable hystart for experiments.

We could include a knob to disable PRR for delay backoffs as well.
Neither was available in FreeBSD.

>> +       u32 now_us = local_clock() / NSEC_PER_USEC;
> now_us = mstamp_us_get();

The reason we went with u32 timestamps here instead of struct
skb_mstamp is the lack of space in struct cdg (using 2*4 bytes
vs. 2*8 bytes).

There is no mstamp_us_get() in net-next? We could make one using
skb_mstamp_get(), but afaics the ACK train does not need stronger
clock guarantees than what local_clock() already offers (per-CPU
monotonicity).
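To make the size argument concrete, here is a userspace sketch; the
struct and field names are illustrative, not the actual struct cdg:

```c
#include <stdint.h>

/* Two u32 microsecond stamps cost 8 bytes; two skb_mstamp-style
 * stamps (64 bits each) cost 16. mstamp_like merely stands in for
 * struct skb_mstamp here.
 */
struct mstamp_like {
	uint64_t v64;
};

struct ack_train_u32 {
	uint32_t round_start;	/* microseconds, from local_clock() */
	uint32_t last_ack;
};

struct ack_train_mstamp {
	struct mstamp_like round_start;
	struct mstamp_like last_ack;
};
```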

>> +       if (tp->snd_cwnd < tp->snd_ssthresh)
>> +               tcp_cdg_hystart_update(sk);
> nit: reno and cubic slow starts w/ snd_cwnd == ssthresh. maybe keep
> this consistent?

I think the question is rather why tcp_cubic invokes HyStart when snd_cwnd == ssthresh.

Let us establish that HyStart is not a stand-alone slow start
algorithm. HyStart does not grow cwnd, but merely helps estimate a
ssthresh to avoid slow start overshoot.

If snd_cwnd == ssthresh, then HyStart's result is moot; we cannot
possibly overshoot less at that point. Imho, unconditionally invoking
HyStart in slow start only serves to mislead us into believing that it
is a slow start algorithm when it is not.


>> +       tcp_reno_cong_avoid(sk, ack, acked);

For the record, CDG still uses reno (i.e. slow start when snd_cwnd == ssthresh).
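The two guards under discussion can be sketched like this (function
names are mine, not kernel symbols): CDG invokes HyStart only while
cwnd is strictly below ssthresh, while reno-style slow start also
grows cwnd when cwnd == ssthresh.

```c
/* CDG's HyStart guard: only run while overshoot is still avoidable. */
int cdg_runs_hystart(unsigned int cwnd, unsigned int ssthresh)
{
	return cwnd < ssthresh;		/* strictly below */
}

/* Reno-style slow-start condition, per the discussion above. */
int reno_in_slow_start(unsigned int cwnd, unsigned int ssthresh)
{
	return cwnd <= ssthresh;	/* also at the boundary */
}
```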

>> +
>> +       if (ca->shadow_wnd) {
>> +               ca->shadow_wnd += tp->snd_cwnd - prior_snd_cwnd;
>> +               if (ca->shadow_wnd > tp->snd_cwnd_clamp)
>> +                       ca->shadow_wnd = tp->snd_cwnd_clamp;
> Not sure why use snd_cwnd_clamp here. looks like it's applied in
> tcp_reno_cong_avoid().

The shadow window can be larger than snd_cwnd.

It behaves like the regular snd_cwnd, but it is immune to delay
backoffs. If loss occurs, presumably because of competing flows, the
shadow window behaves in a way that "undoes" previous delay backoffs.

For example, assume cwnd = shadow = 20. A delay backoff reduces cwnd
to cwnd*0.7 = 14. A subsequent loss backoff would then reduce cwnd to
max(cwnd, shadow)*0.5 = 10, because shadow is still 20. (Without the
shadow window, the new cwnd after loss would have been 7.)
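The arithmetic in that example, as an integer sketch (helper names are
mine, not the kernel code; factors follow the description above):

```c
/* Delay backoff: scale cwnd by 0.7; the shadow window is untouched. */
unsigned int cdg_delay_backoff(unsigned int cwnd)
{
	return cwnd * 7 / 10;
}

/* Loss backoff: halve the larger of cwnd and shadow, which
 * effectively undoes earlier delay backoffs.
 */
unsigned int cdg_loss_backoff(unsigned int cwnd, unsigned int shadow)
{
	unsigned int base = cwnd > shadow ? cwnd : shadow;

	return base / 2;
}
```

With cwnd = shadow = 20, a delay backoff yields 14; a loss backoff
after that yields max(14, 20)/2 = 10, versus 7 without the shadow.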


>> +        * Assume num_acked == 0 indicates RTT measurement from SACK.
>> +        */
> another heuristic to try is to look at tp->sacked_out. when it's
> positive the receiver should not be delaying acks (hopefully).
>

Hopefully. ;)

On pure SACK, num_acked == 0.
But if we get combined SACK + cumulative ACK of 1 segment, then num_acked is 1.

>> +       if (num_acked == 1 && ca->delack) {
So we should extend this check to be (num_acked == 1 && ca->delack &&
!tp->sacked_out).

Thank you for spotting this.


>> +       /* If non-ECN: */
>> +       if (tp->prior_ssthresh) {
> no need to check prior_ssthresh here b/c it's checked in
> tcp_undo_cwnd_reduction() already.

If we get ECE from the receiver (ECN signal), then cwnd undo is disabled.

The above check ensures that the loss tolerance heuristic can only be
used if cwnd undo is possible.

As an alternative, perhaps ssthresh could be extended with a second
parameter to indicate why it was invoked:
ssthresh(sk, TCP_SIGNAL_LOSS)
ssthresh(sk, TCP_SIGNAL_ECN)
ssthresh(sk, TCP_SIGNAL_PRIVATE)

Then tcp_cdg_ssthresh could do:
iff signal == TCP_SIGNAL_LOSS && use_tolerance, use loss tolerance heuristic.
iff signal == TCP_SIGNAL_PRIVATE, calculate delay backoff cwnd
...
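A hypothetical sketch of that dispatch; neither TCP_SIGNAL_* nor this
function exists in net-next, and the backoff math here is illustrative
only:

```c
enum tcp_signal {
	TCP_SIGNAL_LOSS,	/* loss-triggered cwnd reduction */
	TCP_SIGNAL_ECN,		/* receiver echoed an ECN mark */
	TCP_SIGNAL_PRIVATE,	/* CC-internal, e.g. a CDG delay backoff */
};

unsigned int cdg_ssthresh_sketch(enum tcp_signal signal, unsigned int cwnd,
				 unsigned int shadow, int use_tolerance)
{
	switch (signal) {
	case TCP_SIGNAL_LOSS:
		if (use_tolerance)
			return cwnd;	/* placeholder for the loss tolerance heuristic */
		return (cwnd > shadow ? cwnd : shadow) / 2;
	case TCP_SIGNAL_PRIVATE:
		return cwnd * 7 / 10;	/* delay backoff cwnd */
	default:
		return cwnd / 2;
	}
}
```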