Message-ID: <20101208132217.GA28040@verge.net.au>
Date:	Wed, 8 Dec 2010 22:22:17 +0900
From:	Simon Horman <horms@...ge.net.au>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org, Jay Vosburgh <fubar@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow
 control regression]

On Sat, Nov 06, 2010 at 06:25:37PM +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 10:29:45AM +0100, Eric Dumazet wrote:
> > Le mardi 02 novembre 2010 à 17:46 +0900, Simon Horman a écrit :
> > 
> > > Thanks Eric, that seems to resolve the problem that I was seeing.
> > > 
> > > With your patch I see:
> > > 
> > > No bonding
> > > 
> > > # netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > > 
> > > 116736    1472   30.00     2438413      0      957.2     8.52     1.458 
> > > 129024           30.00     2438413             957.2     -1.00    -1.000
> > > 
> > > With bonding (one slave, the interface used in the test above)
> > > 
> > > netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > > 
> > > 116736    1472   30.00     2438390      0      957.1     8.97     1.535 
> > > 129024           30.00     2438390             957.1     -1.00    -1.000
> > > 
> > 
> > 
> > Sure the patch helps when not too many flows are involved, but this is a
> > hack.
> > 
> > Say the device queue is 1000 packets and you run a workload with 2000
> > sockets; it won't work...
> > 
> > Or the device queue is 1000 packets, there is one flow, and the socket
> > send queue size allows more than 1000 packets to be 'in flight' (echo 2000000
> > >/proc/sys/net/core/wmem_default); it won't work with bonding either. It
> > only works when the first device met after the socket has a qdisc.
> 
> True, thanks for pointing that out.
> 
> The scenario that I am actually interested in is virtualisation,
> and I believe that your patch helps the vhostnet case (I don't see
> flow control problems with bonding + virtio without vhostnet). However,
> I am unsure whether there are similarly easy ways to defeat
> flow control in the vhostnet case too.

Hi Eric,

do you have any thoughts on this?

I measured the performance impact of your patch on 2.6.37-rc1
and I can see why early orphaning is a win.

The tests are run over a bond with 3 slaves.
The bond is in balance-rr mode (a rough setup sketch follows the
parameter list below). Other parameters of interest are:
	MTU=1500
	client,server:tcp_reordering=3(default)
	client:GSO=off
	client:TSO=off
	server:GRO=off
	server:rx-usecs=3(default)
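
In case it is useful, the setup above can be reproduced roughly as follows;
the interface names are only illustrative and this is a sketch rather than
the exact command sequence (the slaves may need to be down before they are
enslaved via sysfs):

# modprobe bonding mode=balance-rr miimon=100
# echo +eth0 > /sys/class/net/bond0/bonding/slaves
# echo +eth1 > /sys/class/net/bond0/bonding/slaves
# echo +eth2 > /sys/class/net/bond0/bonding/slaves
# ip link set dev bond0 mtu 1500 up
# ethtool -K eth0 gso off tso off
# ethtool -K eth1 gso off tso off
# ethtool -K eth2 gso off tso off

and on the receiving host:

# ethtool -K eth0 gro off
# ethtool -C eth0 rx-usecs 3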

Without your no-early-orphan patch
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
	172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1621.03   16.31    6.48     1.648   2.621

With your no-early-orphan patch
# netperf -C -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
	172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1433.48   9.60     5.45     1.098   2.490


However, in the case of virtualisation I think it is a win to be able to do
flow control on UDP traffic from guests (using virtio). Am I missing
something, or can flow control be bypassed anyway? If not, perhaps making
the change that your patch makes configurable through proc or ethtool would
be an option?
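
To make the concern concrete, a rough way to check whether flow control is
effective is to overcommit the socket send buffer, as in your example above,
and watch where the excess ends up; the device name and the wmem value are
only examples:

# sysctl -w net.core.wmem_default=2000000
# netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
# tc -s qdisc show dev eth0
# ip -s link show dev eth0

If the sender is throttled by its send buffer, the qdisc and interface drop
counters on the slave stay low; if flow control is bypassed, the excess
shows up as drops there instead.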

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
