Message-ID: <20101208132217.GA28040@verge.net.au>
Date: Wed, 8 Dec 2010 22:22:17 +0900
From: Simon Horman <horms@...ge.net.au>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: netdev@...r.kernel.org, Jay Vosburgh <fubar@...ibm.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow
control regression]
On Sat, Nov 06, 2010 at 06:25:37PM +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 10:29:45AM +0100, Eric Dumazet wrote:
> > Le mardi 02 novembre 2010 à 17:46 +0900, Simon Horman a écrit :
> >
> > > Thanks Eric, that seems to resolve the problem that I was seeing.
> > >
> > > With your patch I see:
> > >
> > > No bonding
> > >
> > > # netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > >
> > > 116736    1472   30.00     2438413      0      957.2      8.52    1.458
> > > 129024           30.00     2438413             957.2     -1.00   -1.000
> > >
> > > With bonding (one slave, the interface used in the test above)
> > >
> > > netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > >
> > > 116736    1472   30.00     2438390      0      957.1      8.97    1.535
> > > 129024           30.00     2438390             957.1     -1.00   -1.000
> > >
> >
> >
> > Sure, the patch helps when not too many flows are involved, but it
> > is a hack.
> >
> > Say the device queue is 1000 packets and you run a workload with
> > 2000 sockets: it won't work.
> >
> > Or the device queue is 1000 packets, there is one flow, and the
> > socket send queue size allows more than 1000 packets to be 'in
> > flight' (echo 2000000 >/proc/sys/net/core/wmem_default): it won't
> > work with bonding either, only with setups where a qdisc sits on the
> > first device met after the socket.
>
> True, thanks for pointing that out.
>
> The scenario that I am actually interested in is virtualisation.
> I believe that your patch helps the vhost-net case (I don't see
> flow control problems with bonding + virtio without vhost-net).
> However, I am unsure whether there are similarly easy ways to
> defeat flow control in the vhost-net case as well.
Hi Eric,
do you have any thoughts on this?
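For reference, I take the second scenario you describe above to be
reproducible with something like the following (same netperf target as
in my tests; the sysctl value is the one from your mail):

    # let a single UDP socket keep far more than the device TX queue
    # in flight
    echo 2000000 > /proc/sys/net/core/wmem_default
    netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472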
I measured the performance impact of your patch on 2.6.37-rc1
and I can see why early orphaning is a win.
The tests were run over a bond with 3 slaves.
The bond is in balance-rr mode. Other parameters of interest are:
MTU=1500
client,server: tcp_reordering=3 (default)
client: GSO=off
client: TSO=off
server: GRO=off
server: rx-usecs=3 (default)
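For completeness, the setup was along these lines (a sketch only;
eth0/eth1/eth2 stand in for the actual slave interfaces):

    # client
    modprobe bonding mode=balance-rr
    ip link set bond0 up
    echo +eth0 > /sys/class/net/bond0/bonding/slaves  # likewise eth1, eth2
    ethtool -K eth0 gso off tso off                   # repeated per slave

    # server
    ethtool -K eth0 gro off
    ethtool -C eth0 rx-usecs 3                        # driver default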
Without your no-early-orphan patch:

# netperf -C -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1621.03   16.31    6.48     1.648   2.621
With your no-early-orphan patch:

# netperf -C -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1433.48    9.60    5.45     1.098   2.490
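(That is, disabling early orphaning costs roughly 12% in throughput
here: (1621.03 - 1433.48) / 1621.03 ~= 0.116, although the send-side
service demand actually improves, 1.648 -> 1.098 us/KB.)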
However, in the case of virtualisation I think it is a win to be able to
do flow control on UDP traffic from guests (using virtio). Am I missing
something, such that flow control can be bypassed anyway? If not, perhaps
making the behaviour that your patch changes configurable through proc or
ethtool would be an option, along the lines of the sketch below.
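To illustrate the kind of knob I mean (both names below are purely
hypothetical; neither exists):

    # sysctl flavour:
    echo 0 > /proc/sys/net/core/xmit_early_orphan

    # or an ethtool flavour, per device:
    ethtool -K bond0 early-orphan off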