Message-Id: <20100907.201843.179933180.davem@davemloft.net>
Date:	Tue, 07 Sep 2010 20:18:43 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	ilpo.jarvinen@...sinki.fi
Cc:	eric.dumazet@...il.com, leandroal@...il.com,
	netdev@...r.kernel.org, kuznet@....inr.ac.ru
Subject: Re: TCP packet size and delivery packet decisions

From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
Date: Tue, 7 Sep 2010 15:15:25 +0300 (EEST)

[ Alexey, the problem is that when the receiver's maximum window is
  minuscule (f.e. equal to MSS :-), we never send full MSS sized frames
  due to our sender side SWS implementation. ]

> On Tue, 7 Sep 2010, Eric Dumazet wrote:
> 
>> On Monday, 06 September 2010 at 22:30 -0700, David Miller wrote:
>> 
>> 
>> > The small 78 byte window is why the sending system is splitting up the
>> > writes into smaller pieces.
>> > 
>> > I presume that the system advertises exactly a 78 byte window because
>> > this is how large the commands are.  But this is an extremely foolish
>> > and baroque thing to do, and it's why you are having problems.
>> 
>> I am not sure why TSO added a "Bound mss with half of window"
>> requirement for tcp_sync_mss()
> 
> I thought it was more related to window behavior in general, and
> much, much older than TSO (it certainly seems to be an old one).
> 
> I guess we might run into some SWS issue if MSS < rwin < 2*MSS with
> your patch that is avoided by the current approach?

Right, this clamping is part of RFC1122 silly window syndrome avoidance.

In ancient times we used to do this straight in sendmsg(), which had
the comment:

			/* We also need to worry about the window.  If
			 * window < 1/2 the maximum window we've seen
			 * from this host, don't use it.  This is
			 * sender side silly window prevention, as
			 * specified in RFC1122.  (Note that this is
			 * different than earlier versions of SWS
			 * prevention, e.g. RFC813.).  What we
			 * actually do is use the whole MSS.  Since
			 * the results in the right edge of the packet
			 * being outside the window, it will be queued
			 * for later rather than sent.
			 */

But in January 2000, Alexey Kuznetsov moved this logic into tcp_sync_mss().
The netdev-vger-2.6 commit is:

--------------------
commit 214d457eb454a70f0f373371de044403834d8042
Author: davem <davem>
Date:   Tue Jan 18 08:24:09 2000 +0000

    Merge in bug fixes and small enhancements
    from Alexey for the TCP/UDP softnet mega-merge
    from the other day.
--------------------

Well, what is the SWS sender rule?  RFC1122 states that we should send
data if (where U == usable window, D == data queued up but not yet sent):

1) if a maximum-sized segment can be sent, i.e., if:

   min(D,U) >= Eff.snd.MSS;

2)  or if the data is pushed and all queued data can
    be sent now, i.e., if:

        [SND.NXT = SND.UNA and] PUSHED and D <= U

    (the bracketed condition is imposed by the Nagle
    algorithm);

3)  or if at least a fraction Fs of the maximum window
    can be sent, i.e., if:

    [SND.NXT = SND.UNA and]

    min(D,U) >= Fs * Max(SND.WND);

4)  or if data is PUSHed and the override timeout
    occurs.
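
The four tests above can be sketched as one ordered check.  This is
only an illustrative model, not kernel code: the struct, its field
names and sws_send_rule() are hypothetical, Fs is assumed to be 1/2,
and the early return for an empty queue is an added guard that the RFC
text does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs for one send decision; all names are illustrative. */
struct sws_state {
	uint32_t queued;            /* D: data queued up but not yet sent */
	uint32_t usable;            /* U: usable window */
	uint32_t eff_snd_mss;       /* Eff.snd.MSS */
	uint32_t max_snd_wnd;       /* Max(SND.WND): largest window seen from peer */
	bool     pushed;            /* queued data carries PUSH */
	bool     nothing_in_flight; /* SND.NXT == SND.UNA (the Nagle condition) */
	bool     override_expired;  /* override timeout has occurred */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Return the number of the first rule (1..4) that allows sending, or 0. */
static int sws_send_rule(const struct sws_state *s)
{
	uint32_t sendable = min_u32(s->queued, s->usable);

	/* Added guard (assumption): nothing queued means nothing to decide. */
	if (s->queued == 0)
		return 0;

	/* 1) a maximum-sized segment can be sent */
	if (sendable >= s->eff_snd_mss)
		return 1;

	/* 2) data is pushed and all queued data fits in the usable window */
	if (s->nothing_in_flight && s->pushed && s->queued <= s->usable)
		return 2;

	/* 3) at least Fs = 1/2 of the maximum window can be sent;
	 *    written as 2 * sendable >= max to avoid rounding.
	 */
	if (s->nothing_in_flight && (uint64_t)sendable * 2 >= s->max_snd_wnd)
		return 3;

	/* 4) data is pushed and the override timeout occurred */
	if (s->pushed && s->override_expired)
		return 4;

	return 0;
}
```

Because the rules are tried in order, a segment of a full Eff.snd.MSS
is always accepted by rule 1 before the window fraction in rule 3 is
even looked at.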

The recommended value for the "Fs" fraction is 1/2, so that's where
the "one-half" logic comes from.

The current code implements this by clamping the MSS in advance to at
most half the largest window we've seen advertised by the peer, f.e.
when doing sendmsg() packetization.  Data is packetized to this
clamped MSS, represented by mss_now.

Then the send path SWS logic will send any packet that is at least
"mss_now".

Effectively we try to implement tests #1 and #3 above at the same
time by performing only test #1, with Eff.snd.MSS clamped to half
the largest receive window we've seen advertised by the peer.
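
A simplified model of that combined test, for illustration only.  It
is not the actual kernel code: clamp_mss_to_half_window() and
combined_send_test() are hypothetical names, and details the real
stack handles (header lengths, minimum sizes, TSO) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: clamp the packetization size to at most half of
 * the largest window the peer has advertised.  A zero max_window (or a
 * window of 1, whose half rounds to zero) leaves the MSS untouched.
 */
static uint32_t clamp_mss_to_half_window(uint32_t mss, uint32_t max_window)
{
	uint32_t half = max_window / 2;

	if (half != 0 && mss > half)
		return half;
	return mss;
}

/* Single send test standing in for both rule #1 and rule #3:
 * send once min(D, U) reaches the (possibly clamped) mss_now.
 */
static bool combined_send_test(uint32_t queued, uint32_t usable,
			       uint32_t mss_now)
{
	uint32_t sendable = queued < usable ? queued : usable;

	return sendable >= mss_now;
}
```

With a 1460 byte MSS and a 65535 byte maximum window the clamp is a
no-op and mss_now stays 1460.  With an MSS of 78 and a maximum window
of 78, mss_now becomes 39.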

This strategy seems to break down when the peer's MSS and the maximum
receive window we'll ever see from the peer are the same order of
magnitude.

It seems that the conditions above really need to be checked in the
right order, and because we try to combine a later test (case #3) with
an earlier test (case #1) we don't send a full sized frame in this
special scenario.
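
To make the difference concrete, here is a small sketch comparing the
two orderings.  The function names are hypothetical, the Nagle
condition and the PUSH rules (#2 and #4) are left out for brevity, and
the figures in the note after the code (MSS and maximum window both 78
bytes) are only an assumed instance of the tiny-window case.

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Combined approach: clamp the MSS to half the maximum window first,
 * then apply only test #1.  Returns the size of the next segment,
 * or 0 if nothing would be sent yet.
 */
static uint32_t next_segment_combined(uint32_t queued, uint32_t usable,
				      uint32_t mss, uint32_t max_window)
{
	uint32_t half = max_window / 2;
	uint32_t mss_now = (half != 0 && mss > half) ? half : mss;
	uint32_t sendable = min_u32(queued, usable);

	return sendable >= mss_now ? mss_now : 0;
}

/* Ordered approach: test #1 against the unclamped MSS, and only then
 * test #3 with Fs = 1/2 (Nagle condition omitted in this sketch).
 */
static uint32_t next_segment_ordered(uint32_t queued, uint32_t usable,
				     uint32_t mss, uint32_t max_window)
{
	uint32_t sendable = min_u32(queued, usable);

	if (sendable == 0)
		return 0;
	if (sendable >= mss)                              /* test #1 */
		return mss;
	if ((uint64_t)sendable * 2 >= max_window)         /* test #3 */
		return sendable;
	return 0;
}
```

With 78 bytes queued, a usable window of 78, an MSS of 78 and a
maximum window of 78, the combined variant yields a 39 byte segment
while the ordered variant yields the full 78 bytes.  With a large
window (say 65535) and a 1460 byte MSS, both yield 1460.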