Date:	Wed, 16 Mar 2011 17:18:38 +0100
From:	Carsten Wolff <carsten@...ffcarsten.de>
To:	Yuchung Cheng <ycheng@...gle.com>
Cc:	David Miller <davem@...emloft.net>,
	Ilpo Jarvinen <ilpo.jarvinen@...sinki.fi>,
	Nandita Dukkipati <nanditad@...gle.com>,
	netdev@...r.kernel.org,
	Alexander Zimmermann <alexander.zimmermann@...sys.rwth-aachen.de>
Subject: Re: [PATCH] tcp: avoid cwnd moderation in undo

Hi again,

On Wednesday 16 March 2011, Yuchung Cheng wrote:
> On Tue, Mar 15, 2011 at 3:07 AM, Carsten Wolff <carsten@...ffcarsten.de> wrote:
> > Hi,
> > 
> > On Monday 14 March 2011, Yuchung Cheng wrote:
> > > On Mon, Mar 14, 2011 at 3:06 AM, Carsten Wolff <carsten@...ffcarsten.de> wrote:
> > > In the presence of reordering, cwnd is already moderated in Disorder
> > > state before entering the (false) recovery.
> > 
> > Sure, cwnd moderation to in_flight + 1 segment is applied in disorder
> > state,
> 
> it's in_flight + 3 usually. the moderation first happens in
> tcp_try_to_open() instead of tcp_cwnd_down()

In disorder state, tcp_try_to_open() calls tcp_cwnd_down(), which clamps cwnd 
to in_flight + 1 for dupacks (where tcp_packets_in_flight() is not to be 
confused with the IN_FLIGHT variable in IETF documents, which is called 
packets_out in Linux ...). Otherwise, Linux would be violating RFC3042, which 
allows sending one SMSS of data on each dupack before recovery (actually, just 
on the first two, but since the DupThresh can be larger than 3 in Linux, it 
extends Limited Transmit to more than just the first two dupacks). This is 
mostly equivalent to the aggressive variant of extended limited transmit in 
RFC4653.
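
To illustrate the clamp described above, here is a minimal sketch in Python. It is a toy model, not kernel code: the function names are made up, and it assumes that every dupack SACKs exactly one outstanding segment, with no losses and no retransmissions, so that the in-flight estimate is simply packets_out - sacked_out.

```python
def clamp_on_dupack(cwnd, in_flight):
    """Clamp cwnd to in_flight + 1, the per-dupack moderation described above."""
    return min(cwnd, in_flight + 1)


def limited_transmit(cwnd, packets_out, dupacks):
    """Return how many new segments each dupack releases (toy model).

    Assumption: each dupack SACKs one outstanding segment, so the
    in-flight estimate is packets_out - sacked_out.
    """
    sacked_out = 0
    released = []
    for _ in range(dupacks):
        sacked_out += 1
        in_flight = packets_out - sacked_out
        cwnd = clamp_on_dupack(cwnd, in_flight)
        # Whatever room the clamped window leaves may be filled with new data.
        new_segments = max(cwnd - in_flight, 0)
        packets_out += new_segments
        released.append(new_segments)
    return released


# Hypothetical numbers: 10 segments outstanding, three dupacks arrive.
print(limited_transmit(cwnd=10, packets_out=10, dupacks=3))
print(limited_transmit(cwnd=20, packets_out=10, dupacks=3))
```

In this model both calls release one segment per dupack, even when the window is far larger than the data outstanding. That is the Limited Transmit behaviour the clamp is meant to produce.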

> > because this is implementing a form of extended limited transmit.
> > Nevertheless, after a reordering event that caused a spurious fast
> > retransmit, there can be an undo of congestion state changes (either
> > after recovery or interrupting recovery, depending on the options
> > enabled in the connection). I just wanted to point out, that the
> > moderation step happening upon an undo may allow a larger burst, if a
> > previous reordering event was detected and caused tp->reordering to be
> > increased.
> 
> Your point is that cwnd should be moderated on reordering (in undo or
> other events). Point taken.
>  My point is that cwnd does not need to be moderated on false
> recoveries. Do you agree?
> To implement your design, tcp_update_reordering should do
> tcp_cwnd_moderation().
> To implement my point, the moderations should be avoided in undo
> operations.
> 
> The two aren't in conflict. But there are cases that have both undo
> and reordering.
> Are we on the same page?

Unfortunately, no. ;-) My point is that cwnd should be moderated when the 
congestion state changes are undone after a spurious recovery has been 
detected. Reordering is only one possible reason for a false recovery. And I 
stick to that point because of the reasoning I laid out in my mail to John, 
i.e. an undo typically leads to exceptionally large segment bursts.
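
A rough sketch of why the undo is the critical moment, again as a toy model in Python rather than kernel code. The function name and the numbers are hypothetical. The only assumptions are that the undo restores the prior cwnd while few segments are still in flight, and that moderation caps cwnd at in_flight + max_burst.

```python
def burst_after_undo(prior_cwnd, in_flight, max_burst=None):
    """Segments that may leave back-to-back right after an undo (toy model).

    prior_cwnd: window restored by the undo.
    in_flight:  segments still unacknowledged in the network at that moment.
    max_burst:  if given, cwnd is first moderated to in_flight + max_burst.
    """
    cwnd = prior_cwnd
    if max_burst is not None:
        cwnd = min(cwnd, in_flight + max_burst)
    return max(cwnd - in_flight, 0)


# Hypothetical numbers: prior cwnd of 40, only 5 segments left in flight.
print(burst_after_undo(40, 5))               # unmoderated undo
print(burst_after_undo(40, 5, max_burst=3))  # moderated undo
```

With these numbers the unmoderated undo permits 35 segments at once, while the moderated one permits 3.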

As for cwnd moderation upon the detection of a reordering event (that's a 
different thing at a different point in time than detection of a false 
recovery!): This wouldn't make sense to me. The detection of the reordering 
event, together with a metric that measures the extent of the reordering, can 
be used to try to prevent false recoveries in future reordering events, by 
delaying the congestion reaction (i.e. fast retransmit) then.
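
As a sketch of that idea (a toy model in Python with hypothetical names): the sender remembers the extent of an observed reordering event and raises its dupack threshold accordingly, so that a later event of the same extent no longer triggers a fast retransmit. The update rule used here is a deliberate simplification and not how Linux maintains tp->reordering.

```python
def spurious_fast_retransmits(reorder_events, dupthresh=3, adaptive=True):
    """Count false fast retransmits over a series of reordering events.

    reorder_events: for each event, the number of dupacks seen before the
                    delayed segment finally arrives (nothing is really lost).
    adaptive:       if True, raise dupthresh above the extent just observed
                    (hypothetical update rule).
    """
    spurious = 0
    for dupacks in reorder_events:
        if dupacks >= dupthresh:
            # Threshold reached although nothing was lost: false recovery.
            spurious += 1
            if adaptive:
                dupthresh = max(dupthresh, dupacks + 1)
    return spurious


events = [5, 5, 4, 5]  # hypothetical reordering extents, in dupacks
print(spurious_fast_retransmits(events, adaptive=False))
print(spurious_fast_retransmits(events, adaptive=True))
```

With a fixed threshold every event causes a false recovery. With the adaptive threshold only the first one does. Undoing that first false recovery is a separate step from learning the threshold.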

Reordering can be a cause of spurious recovery. But undo mechanisms and 
mechanisms to prevent false recoveries are orthogonal.

Your patch touches all undos, while reordering is just an example for a cause 
of false recovery.

> > > > More importantly, the prior ssthresh is restored and not affected by
> > > > moderation. This means, if moderation reduces cwnd to a small value,
> > > > then cwnd < ssthresh and TCP will quickly slow-start back to the
> > > > previous state, without sending a big burst of segments.
> > 
> > This is actually the more important point, because it means that the
> > moderation does not negate the effects of the undo operation, as
> > suggested by your patch-description.
> 
> It's a double-edge sword. Why slow-start if there is no real loss?

It's a timing thing. I mean, it is an undo operation: the harm has been done, 
some opportunity to send new data has been lost. Trying to send all that data 
at once now, without an ACK clock, will cause more harm when buffers are under 
pressure. The undo operation should not try to make up for lost opportunity, 
only try to reduce further loss of opportunity to send new data. For this, the 
segment bursts have to be moderated.
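
To put a number on the cost of moderation, here is a toy slow-start model in Python. The function name and the values are hypothetical, and it assumes one ACK per segment and no delayed ACKs, so that cwnd doubles every round trip until it reaches the restored ssthresh.

```python
def slow_start_trajectory(cwnd, ssthresh):
    """Return cwnd at the start of each RTT while slow-starting to ssthresh.

    Toy model: cwnd doubles once per round trip and is capped at ssthresh.
    """
    if cwnd < 1:
        raise ValueError("cwnd must be at least 1 segment")
    trajectory = [cwnd]
    while cwnd < ssthresh:
        cwnd = min(cwnd * 2, ssthresh)
        trajectory.append(cwnd)
    return trajectory


# Hypothetical numbers: cwnd moderated to 3, prior ssthresh of 40 restored.
print(slow_start_trajectory(3, 40))
```

In this model the window is back at 40 after four round trips, and every segment on the way there is released by an arriving ACK instead of leaving in one burst.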

> It hurts short request-response type traffic performance badly b/c each
> undo makes cwnd = 3.
> 
> > False fast retransmits are mostly caused by reordering, spurious RTOs can
> > also be caused by delay variations that do not exhibit reordering. Your
> > patch touches all cases of spurious events. Anyway, I just mentioned
> > reordering, because it is the event in which Linux already allows larger
> > bursts of size tp->reordering in the moderation function (i.e.
> > tp->reordering might be increased). It's also not important to me if the
> > undo is happening during or after recovery, the important question is, if
> > burst protection in general is an important goal, or not (and I think
> > it's there for a reason).
> 
> I am hoping my previous explanation makes sense to you (these two points are
> not in conflict).

I hope the same for my explanations. :-)

Cheers
Carsten
-- 
           /\-´-/\
          (  @ @  )
________o0O___^___O0o________
