Date:	Wed, 9 Jul 2008 03:38:58 -0700
From:	"Jerry Chu" <hkchu@...gle.com>
To:	johnwheffner@...il.com
Cc:	netdev@...r.kernel.org, aglo@...i.umich.edu, ranjitm@...gle.com
Subject: Re: setsockopt()

On Wed, Jul 9, 2008 at 2:55 AM, H.K. Jerry Chu <hkjerry.chu@...il.com> wrote:
>
>
> ---------- Forwarded message ----------
> From: John Heffner <johnwheffner@...il.com>
> Date: Mon, Jul 7, 2008 at 8:33 PM
> Subject: Re: setsockopt()
> To: Rick Jones <rick.jones2@...com>
> Cc: netdev@...r.kernel.org
>
>
> On Mon, Jul 7, 2008 at 3:50 PM, Rick Jones <rick.jones2@...com> wrote:
> > I'm still a trifle puzzled/concerned/confused by the extent to which
> > autotuning will allow the receive window to grow. Again, based on some
> > netperf experience thus far, and patient explanations provided here and
> > elsewhere, it seems as though autotuning will let things get to 2x what it
> > thinks the sender's cwnd happens to be.  So far under netperf testing that
> > seems to be the case, and 99 times out of ten my netperf tests will have the
> > window grow to the max.
>
>
> Rick,
>
> I thought this was covered pretty thoroughly back in April.  The
> behavior you're seeing is 100% expected, and not likely to change
> unless Jerry Chu gets his local queued data measurement patch working.
>  I'm not sure what ultimately happened there, but it was a cool idea
> and I hope he has time to polish it up.  It's definitely tricky to get
> right.

Yes, most certainly! I've had the non-TSO code mostly working for the
past couple of months (i.e., cwnd grows only to ~50KB on a local 1GbE
setup). But no such luck with TSO. Although the idea (excluding pkts
still stuck in queues inside the sending host from "in_flight" when
deciding whether cwnd needs to grow) seems simple, getting the
accounting right for TSO seems impossible. After catching and fixing
a slew of cases for 1GbE and seemingly getting close to the end of the
tunnel, I moved my tests to 10GbE last month and discovered accounting
leakage again. Basically, my count of the pkts still stuck inside the
host sometimes becomes larger than the total in flight. I have not
figured out which skb paths I might have missed. Or is it possible that
the over-zealously tuned 10G drivers are doing something funky?
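
To make the accounting idea concrete, here is a minimal sketch in
Python. It is a toy model under stated assumptions, not the patch
itself: the field host_queued and the helper names are hypothetical,
and the real kernel bookkeeping (especially with TSO) is far more
involved.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    snd_cwnd: int      # congestion window, in packets
    packets_out: int   # packets handed down by TCP and not yet acked
    host_queued: int   # hypothetical: packets still in local queues
                       # (qdisc, driver ring), i.e. not yet on the wire


def accounting_leaked(s: SenderState) -> bool:
    """The leakage symptom: more packets counted as stuck in the host
    than TCP believes are outstanding in total."""
    return s.host_queued > s.packets_out


def network_in_flight(s: SenderState) -> int:
    """Packets assumed to be on the wire: outstanding minus those still
    queued locally, clamped at zero so a leak cannot go negative."""
    return max(s.packets_out - s.host_queued, 0)


def should_grow_cwnd(s: SenderState) -> bool:
    """Grow cwnd only if the packets actually in the network fill it."""
    return network_in_flight(s) >= s.snd_cwnd
```

With snd_cwnd=10, packets_out=10 and host_queued=6, only 4 packets
count as in the network, so the sender is not treated as cwnd-limited
and cwnd does not grow.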

Not to mention a number of other tricky scenarios. E.g., when TSO
is enabled on 1GbE, the code works well for a netperf streaming test but
not for the RR test with a 1MB request size. After a while I discovered
that tcp_sendmsg() for the 1MB RR tests often runs in a tight loop
without entering flow control, hence always hitting snd_cwnd, even
though acks have come back. This is because the socket lock is only
released during flow control, so acks that arrive in the meantime are
not processed until then. The problem went away once I checked for
return traffic and let it in inside the tcp_sendmsg() loop. This kind
of stuff can easily spoil my original simple algorithm.
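
The tight-loop effect can be illustrated with a small simulation, again
in Python. This is only a sketch under simplifying assumptions: one ack
per segment, acks arriving immediately, and a hypothetical send_loop()
function that stands in for the send path. It is not the kernel's
tcp_sendmsg().

```python
from collections import deque


def send_loop(segments: int, cwnd: int, drain_backlog_in_loop: bool) -> int:
    """Toy model of a send loop that holds the socket lock.

    Acks that arrive while the lock is held are parked in a backlog and
    only reduce in_flight once the backlog is drained. Returns how many
    times the sender found itself limited by cwnd.
    """
    if cwnd < 1:
        raise ValueError("cwnd must be at least 1")

    in_flight = 0
    backlog = deque()       # acked-segment counts waiting to be processed
    cwnd_limited_hits = 0
    sent = 0

    while sent < segments:
        if drain_backlog_in_loop:
            # The workaround: let return traffic in on every iteration.
            while backlog:
                in_flight -= backlog.popleft()

        if in_flight >= cwnd:
            # Flow control: the lock is released and the backlog runs.
            cwnd_limited_hits += 1
            while backlog:
                in_flight -= backlog.popleft()
            continue

        in_flight += 1
        sent += 1
        backlog.append(1)   # the ack is back already, but it is parked

    return cwnd_limited_hits
```

In this model, send_loop(100, 10, False) keeps running into cwnd even
though every ack has already returned, while send_loop(100, 10, True)
never does.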

Jerry

> Jerry's optimization is a sender-side change.  The fact that the
> receiver announces enough window is almost certainly the right thing
> for it to do, and (I hope) this will not change.
>
> If you're still curious:
> http://www.psc.edu/networking/ftp/papers/autotune_sigcomm98.ps
> http://www.lanl.gov/radiant/pubs/drs/lacsi2001.pdf
> http://staff.psc.edu/jheffner/papers/senior_thesis.pdf
>
>  -John
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>