Message-Id: <1261163601.9365.82.camel@w-sridhar.beaverton.ibm.com>
Date: Fri, 18 Dec 2009 11:13:21 -0800
From: Sridhar Samudrala <sri@...ibm.com>
To: Krishna Kumar2 <krkumar2@...ibm.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Herbert Xu <herbert@...dor.apana.org.au>,
Jarek Poplawski <jarkao2@...il.com>, mst@...hat.com,
netdev@...r.kernel.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with
vhost-net
On Fri, 2009-12-18 at 19:16 +0530, Krishna Kumar2 wrote:
> >
> > 2.6.32 + Rusty's xmit_napi v2 patch + don't stop early & drop skb on fail patch
> >
> -------------------------------------------------------------------------------
>
> > $./netperf -c -C -H 192.168.122.1 -t TCP_STREAM -l60
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.1 (192.168.122.1) port 0 AF_INET
> > Recv   Send   Send                          Utilization      Service Demand
> > Socket Socket Message Elapsed               Send    Recv     Send    Recv
> > Size   Size   Size    Time     Throughput   local   remote   local   remote
> > bytes  bytes  bytes   secs.    10^6bits/s   % S     % S      us/KB   us/KB
> >
> > 87380  16384  16384   60.03    7741.65      70.09   72.84    0.742   1.542
> > [sridhar@...alhost ~]$ tc -s qdisc show dev eth0
> > qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> > Sent 58149531018 bytes 897991 pkt (dropped 0, overlimits 0 requeues 1)
> > rate 0bit 0pps backlog 0b 0p requeues 1
>
> Is the "drop skb" patch doing:
> - return NETDEV_TX_BUSY;
> +
> + /* drop the skb under stress. */
> + vi->dev->stats.tx_dropped++;
> + kfree_skb(skb);
> + return NETDEV_TX_OK;
Yes. This is the patch I used with plain 2.6.32. But with Rusty's patch,
I also commented out the if condition that stops the queue early in
start_xmit().
>
> Why is dropped count zero in the last test case?
The dropped count reported by 'tc' covers only drops at the qdisc level, which
are counted via qdisc_drop(). Drops at the driver level are counted in the
net_device stats and are reported by the 'ip -s link' command. I see a few
drops (5-6) in a 60 sec run with the 2.6.31 kernel.
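To make the two counters concrete, here is a minimal sketch that pulls the qdisc-level numbers out of 'tc -s qdisc' output and the driver-level TX drop count out of 'ip -s link' output. The helper names (qdisc_counters, driver_tx_dropped) are made up for illustration, and the parsing assumes the output layout pasted later in this mail; other iproute2 versions may print the fields differently.

```python
import re

# Sample text in the layout shown in this mail (64K message run).
TC_SAMPLE = """\
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 61613336514 bytes 1208233 pkt (dropped 0, overlimits 0 requeues 237750)
 rate 0bit 0pps backlog 0b 0p requeues 237750
"""

IP_SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 54:52:00:35:e3:74 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    59348763   899170   0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1483793932 1208230  0       0       0       0
"""


def qdisc_counters(tc_output):
    """Return qdisc-level dropped/overlimits/requeues from 'tc -s qdisc' text."""
    m = re.search(r"dropped (\d+), overlimits (\d+) requeues (\d+)", tc_output)
    if m is None:
        raise ValueError("no qdisc statistics line found")
    return {
        "dropped": int(m.group(1)),
        "overlimits": int(m.group(2)),
        "requeues": int(m.group(3)),
    }


def driver_tx_dropped(ip_output):
    """Return the driver-level TX 'dropped' counter from 'ip -s link' text."""
    lines = ip_output.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.strip().startswith("TX:"):
            header = line.split()[1:]        # column names after "TX:"
            values = lines[i + 1].split()    # numbers on the following line
            return int(values[header.index("dropped")])
    raise ValueError("no TX statistics block found")


if __name__ == "__main__":
    print("qdisc level :", qdisc_counters(TC_SAMPLE))
    print("driver level: tx_dropped =", driver_tx_dropped(IP_SAMPLE))
```

A packet freed inside the driver's start_xmit() would only show up in the second number, which is why 'tc' can report "dropped 0" while the driver is dropping.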
>
> sch_direct_xmit is called from two places, and if it finds
> the txq stopped, it was called from __dev_xmit_skb (where
> the previous successful xmit had stopped the queue). This
> means the device is still stopping and restarting 1000's
> of times a min, and each restart fills up the device h/w
> queue with the backlogged skbs resulting in another stop.
> Isn't the txqlen set to 1000 in ether_setup? Can you
> increase the restart limit to a really high value, like
> 1/2 or 3/4th of the queue should be empty? Another thing
> to test is to simultaneously set txqueuelen to a big value.
txqueuelen limits the qdisc queue, not the device transmit queue.
The device tx queue length is set by qemu and defaults to 256 for
virtio-net. So a reasonable wakeup threshold could be 64 or 128, and
it does reduce the number of requeues.
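As a rough illustration of why a larger wakeup threshold cuts down the stop/restart cycles, here is a toy model. It is not the virtio_net logic: it assumes the sender always refills the ring faster than the device drains it, and that the queue is woken exactly when wake_threshold slots are free. The function name and parameters are hypothetical.

```python
def count_queue_stops(total_packets, ring_size=256, wake_threshold=1):
    """Count how often a tx queue is stopped in a simplified model.

    Assumptions (toy model only): the sender fills every free slot as
    soon as the queue is awake, and the device frees exactly
    wake_threshold slots before the queue is woken again.
    """
    if not 1 <= wake_threshold <= ring_size:
        raise ValueError("wake_threshold must be in 1..ring_size")

    in_ring = 0   # descriptors currently owned by the device
    queued = 0    # packets handed to the device so far
    stops = 0     # number of times the queue was stopped

    while queued < total_packets:
        # Sender fills the ring until it is full or it runs out of packets.
        burst = min(ring_size - in_ring, total_packets - queued)
        in_ring += burst
        queued += burst

        if in_ring == ring_size:
            # Ring full: the queue is stopped until enough slots free up.
            stops += 1
            in_ring -= wake_threshold

    return stops


if __name__ == "__main__":
    for threshold in (1, 64, 128):
        stops = count_queue_stops(1024, ring_size=256, wake_threshold=threshold)
        print(f"wake_threshold={threshold:3d} -> {stops} stops")
```

In this model the number of stops scales roughly with total_packets / wake_threshold, so waking at 64 or 128 free slots instead of 1 reduces the cycles by about that factor. Real behaviour depends on timing between guest and host, which the model ignores.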
>
> Requeue does not seem to be the reason for BW drop since
> it barely improved when requeue's reduced from 340K to 40K.
> So, as Jarek suggested, GSO could be reason. You could try
> testing with 64K I/O size (with GSO enabled) to get
> comparable results.
Yes. With 64K messages, I am getting comparable throughput, in fact
slightly better, although CPU utilization is higher. So it looks like
the better throughput with the 2.6.31 kernel at 16K message size is a
side effect of the drops.
I think Rusty's patch with 1/4 of the tx ring as the wakeup threshold is the
first step to address the queue full warnings in 2.6.32. With further tuning
it may be possible to eliminate the requeues.
2.6.32 + Rusty's xmit_napi_v2 patch
$ ./netperf -c -C -H 192.168.122.1 -t TCP_STREAM -l60 -- -m 65536
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.1 (192.168.122.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 65536 60.03 8200.80 92.52 91.63 0.924 1.831
$ tc -s qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 61613336514 bytes 1208233 pkt (dropped 0, overlimits 0 requeues 237750)
rate 0bit 0pps backlog 0b 0p requeues 237750
$ ip -s link show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 54:52:00:35:e3:74 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
59348763 899170 0 0 0 0
TX: bytes packets errors dropped carrier collsns
1483793932 1208230 0 0 0 0
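For comparing runs, two derived numbers from the 'tc' counters may be handy: requeues per packet and average bytes per packet at the qdisc. The sketch below just does that arithmetic on the counters pasted in this mail (the 64K run above and the quoted 16K run). The function name is hypothetical, and reading bytes per packet as an indicator of GSO aggregation is an assumption, not something measured here.

```python
def per_packet_stats(sent_bytes, packets, requeues):
    """Derive per-packet figures from 'tc -s qdisc' counters."""
    return {
        "bytes_per_pkt": sent_bytes / packets,
        "requeues_per_pkt": requeues / packets,
    }


# Counters copied from the tc output in this mail.
RUN_64K = per_packet_stats(61613336514, 1208233, 237750)  # xmit_napi v2, -m 65536
RUN_16K = per_packet_stats(58149531018, 897991, 1)        # quoted run, 16K messages

if __name__ == "__main__":
    for name, stats in (("64K msgs", RUN_64K), ("16K msgs", RUN_16K)):
        print(f"{name}: {stats['bytes_per_pkt']:.0f} bytes/pkt, "
              f"{stats['requeues_per_pkt']:.4f} requeues/pkt")
```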
Thanks
Sridhar