Message-ID: <OF0BDA6B3A.F673A449-ON652577BC.00422911-652577BC.0043474B@in.ibm.com>
Date: Thu, 14 Oct 2010 17:47:54 +0530
From: Krishna Kumar2 <krkumar2@...ibm.com>
To: Krishna Kumar2 <krkumar2@...ibm.com>
Cc: anthony@...emonkey.ws, arnd@...db.de, avi@...hat.com,
davem@...emloft.net, kvm@...r.kernel.org,
"Michael S. Tsirkin" <mst@...hat.com>, netdev@...r.kernel.org,
rusty@...tcorp.com.au
Subject: Re: [v2 RFC PATCH 0/4] Implement multiqueue virtio-net
Krishna Kumar2/India/IBM wrote on 10/14/2010 02:34:01 PM:
> void vhost_poll_queue(struct vhost_poll *poll)
> {
>         struct vhost_virtqueue *vq = vhost_find_vq(poll);
>
>         vhost_work_queue(vq, &poll->work);
> }
>
> Since poll batches packets, find_vq does not seem to add much
> to the CPU utilization (or BW). I am sure that code can be
> optimized much better.
>
> The results I sent in my last mail were without your use_mm
> patch, and the only tuning was to make vhost threads run on
> only cpus 0-3 (though the performance is good even without
> that). I will test it later today with the use_mm patch too.
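[Editor's note: the vhost_find_vq() used in the quoted code comes from the MQ
patch and its actual implementation is not shown here. Purely as a sketch of
one plausible shape -- the field names poll->dev, dev->nvqs, dev->vqs[] and
vq->poll are assumptions, not taken from the patch -- it could be a linear
scan over the device's virtqueues:

/*
 * Hypothetical sketch, not the code from the RFC patch: map a
 * vhost_poll back to the virtqueue that embeds it by scanning the
 * owning device's virtqueue array.  Assumes poll->dev, dev->nvqs,
 * dev->vqs[] and vq->poll exist under these names.
 */
static struct vhost_virtqueue *vhost_find_vq(struct vhost_poll *poll)
{
        struct vhost_dev *dev = poll->dev;
        int i;

        for (i = 0; i < dev->nvqs; i++)
                if (&dev->vqs[i].poll == poll)
                        return &dev->vqs[i];
        return NULL;
}

With only a handful of virtqueues per device, and with poll batching packets,
such a scan would be cheap, which is consistent with the observation above
that find_vq does not add much CPU cost.]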
There is a significant reduction in CPU/SD utilization with your
patch. Below is the performance of ORG (original code) vs the MQ+mm patch:
_________________________________________________
          Org vs MQ+mm patch, txq=2
  #      BW%     CPU%    RCPU%      SD%     RSD%
_________________________________________________
  1     2.26    -1.16      .27   -20.00        0
  2    35.07    29.90    21.81        0   -11.11
  4    55.03    84.57    37.66    26.92    -4.62
  8    73.16   118.69    49.21    45.63     -.46
 16    77.43    98.81    47.89    24.07    -7.80
 24    71.59   105.18    48.44    62.84    18.18
 32    70.91   102.38    47.15    49.22     8.54
 40    63.26    90.58    41.00    85.27    37.33
 48    45.25    45.99    11.23    14.31   -12.91
 64    42.78    41.82     5.50      .43   -25.12
 80    31.40     7.31   -18.69    15.78   -11.93
 96    27.60     7.79   -18.54    17.39   -10.98
128    23.46   -11.89   -34.41     -.41   -25.53
_________________________________________________
BW: 40.2   CPU/RCPU: 29.9, -2.2   SD/RSD: 12.0, -15.6
Below is the performance of MQ vs the MQ+mm patch:
_____________________________________________________
               MQ vs MQ+mm patch
  #      BW%     CPU%    RCPU%      SD%     RSD%
_____________________________________________________
  1     4.98     -.58      .84   -20.00        0
  2     5.17     2.96     2.29        0    -4.00
  4     -.18      .25     -.16     3.12      .98
  8    -5.47    -1.36    -1.98    17.18    16.57
 16    -1.90    -6.64    -3.54   -14.83   -12.12
 24     -.01    23.63    14.65    57.61    46.64
 32      .27    -3.19    -3.11   -22.98   -22.91
 40    -1.06    -2.96    -2.96    -4.18    -4.10
 48     -.28    -2.34    -3.71    -2.41    -3.81
 64     9.71    33.77    30.65    81.44    77.09
 80   -10.69   -31.07   -31.70   -29.22   -29.88
 96    -1.14     5.98      .56   -11.57   -16.14
128     -.93   -15.60   -18.31   -19.89   -22.65
_____________________________________________________
BW: 0   CPU/RCPU: -4.2, -6.1   SD/RSD: -13.1, -15.6
_____________________________________________________
Each test case runs for 60 secs, and the numbers are summed over
two runs (except when the number of netperf sessions is 1, which
uses 7 runs of 10 secs each), with numcpus=4, numtxqs=8, etc. No
tuning was done other than using taskset to restrict each vhost
thread to cpus 0-3.
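[Editor's note: the pinning above was done with the taskset utility. Purely
as an illustration of what that pinning amounts to, and not part of the patch
or the test harness, the same effect can be obtained programmatically with
sched_setaffinity():

/* Illustration only: confine a task to cpus 0-3, the same effect as
 * the taskset-based pinning of the vhost threads used in these runs. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
        cpu_set_t set;
        pid_t pid = argc > 1 ? atoi(argv[1]) : 0;  /* 0 = calling process */
        int cpu;

        CPU_ZERO(&set);
        for (cpu = 0; cpu <= 3; cpu++)
                CPU_SET(cpu, &set);

        if (sched_setaffinity(pid, sizeof(set), &set)) {
                perror("sched_setaffinity");
                return 1;
        }
        return 0;
}

Invoked with a vhost thread's pid, this confines that thread to cpus 0-3,
which is what the taskset pinning above does.]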
Thanks,
- KK