Message-ID: <OF7E84392C.8FAA20F2-ON6525779D.00112DE2-6525779D.0016E2C5@in.ibm.com>
Date: Mon, 13 Sep 2010 09:42:22 +0530
From: Krishna Kumar2 <krkumar2@...ibm.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: anthony@...emonkey.ws, davem@...emloft.net, kvm@...r.kernel.org,
netdev@...r.kernel.org, rusty@...tcorp.com.au
Subject: Re: [RFC PATCH 0/4] Implement multiqueue virtio-net
"Michael S. Tsirkin" <mst@...hat.com> wrote on 09/12/2010 05:10:25 PM:
> > SINGLE vhost (Guest -> Host):
> > 1 netperf: BW: 10.7% SD: -1.4%
> > 4 netperfs: BW: 3% SD: 1.4%
> > 8 netperfs: BW: 17.7% SD: -10%
> > 16 netperfs: BW: 4.7% SD: -7.0%
> > 32 netperfs: BW: -6.1% SD: -5.7%
> > BW and SD both improve (guest multiple txqs help). For 32
> > netperfs, SD improves.
> >
> > But with multiple vhosts, guest is able to send more packets
> > and BW increases much more (SD too increases, but I think
> > that is expected).
>
> Why is this expected?
Results with the original kernel:
__________________________________
#netperfs      BW      SD      RSD
__________________________________
        1   20903       1        6
        2   21963       6       25
        4   22042      23      102
        8   21674      97      419
       16   22281     379     1663
       24   22521     857     3748
       32   22976    1528     6594
       40   23197    2390    10239
       48   22973    3542    15074
       64   23809    6486    27244
       80   23564   10169    43118
       96   22977   14954    62948
      128   23649   27067   113892
__________________________________
With a higher number of threads running in parallel, SD
increased. In this case most threads run in parallel only
till __dev_xmit_skb (#numtxqs=1). With the mq TX patch, a
higher number of threads run in parallel through
ndo_start_xmit. I *think* the increase in SD is due to more
threads running through a longer code path.
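To make the txq point concrete, here is a rough sketch of the
kind of per-flow queue selection that lets senders take
different txq locks and reach ndo_start_xmit() in parallel,
instead of all serializing behind the single lock taken in
__dev_xmit_skb(). This is for illustration only - it is not
necessarily what the patch does, and the helper name and the
jhash-based flow key are made up:

#include <linux/jhash.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Illustrative only: pick a TX queue per flow so concurrent
 * senders take different txq locks and run ndo_start_xmit()
 * in parallel, instead of all queueing behind the one lock
 * that #numtxqs=1 forces in __dev_xmit_skb(). */
static u16 example_select_txq(struct net_device *dev, struct sk_buff *skb)
{
	/* If the stack already recorded an RX queue for this
	 * flow, keep the flow on the matching TX queue. */
	if (skb_rx_queue_recorded(skb))
		return skb_get_rx_queue(skb) % dev->real_num_tx_queues;

	/* Otherwise use a cheap hash as the flow key. */
	return jhash_1word(skb->protocol, skb->priority) %
	       dev->real_num_tx_queues;
}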
From the numbers I posted with the patch (cut-and-pasting
only the % parts), BW increased much more than SD, sometimes
more than twice as much (e.g. at N#=32, BW improved 72.24%
against a 34.26% increase in SD):
N#        BW%      SD%     RSD%
 4      54.30    40.00    -1.16
 8      71.79    46.59    -2.68
16      71.89    50.40    -2.50
32      72.24    34.26   -14.52
48      70.10    31.51   -14.35
64      69.01    38.81    -9.66
96      70.68    71.26    10.74
I also think the SD calculation gets skewed for guest ->
local host testing. For this test, I ran a guest with numtxqs=16.
The first result below is with my patch, which creates 16
vhosts. The second result is with a modified patch which
creates only 2 vhosts (testing with #netperfs = 64):
#vhosts      BW%      SD%     RSD%
     16    20.79   186.01   149.74
      2    30.89    34.55    18.44
The remote SD increases with the number of vhost threads,
but that number seems to correlate with guest SD. So although
BW% increased only slightly, from 20% to 30%, SD fell
drastically from 186% to 34%. I think it could be a
calculation skew with host SD, which also fell from 150% to 18%.
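My working model of the skew (a simplification, not netperf's
exact formula, and the numbers below are made up) is that
service demand is roughly CPU time burned per KB moved, so
every additional busy vhost thread inflates the host-side
numerator even when the guest-side numbers look sane:

#include <stdio.h>

/* Back-of-the-envelope service demand: CPU-microseconds
 * consumed per KB transferred.  An approximation for
 * illustration, not netperf's code. */
static double sd_usec_per_kb(double cpu_util, int cpus, double secs,
			     double kbytes)
{
	return (cpu_util * cpus * secs * 1e6) / kbytes;
}

int main(void)
{
	/* Hypothetical runs: many vhost threads keep many host
	 * CPUs partly busy, so host SD balloons even if BW
	 * barely moves. */
	printf("16 vhosts: %.1f usec/KB\n", sd_usec_per_kb(0.6, 16, 60, 2e7));
	printf(" 2 vhosts: %.1f usec/KB\n", sd_usec_per_kb(0.6,  2, 60, 2e7));
	return 0;
}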
I am planning to submit a 2nd patch revision with a
restricted number of vhosts.
> > Likely cause for the 1 stream degradation with multiple
> > vhost patch:
> >
> > 1. Two vhosts run handling the RX and TX respectively.
> > I think the issue is related to cache ping-pong esp
> > since these run on different cpus/sockets.
>
> Right. With TCP I think we are better off handling
> TX and RX for a socket by the same vhost, so that
> packet and its ack are handled by the same thread.
> Is this what happens with RX multiqueue patch?
> How do we select an RX queue to put the packet on?
My (unsubmitted) RX patch doesn't do this yet; that is
something I will check.
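One possible direction (purely a sketch - none of this is from
my RX patch, and the table and helper names are hypothetical)
is to record at TX time which virtqueue pair a flow used and
steer its RX traffic to the same pair, so data and ACKs land
on the same vhost thread:

#include <linux/types.h>

/* Hypothetical flow -> virtqueue-pair table, sized arbitrarily. */
#define EX_FLOW_TABLE_SIZE 256
static u16 ex_flow_to_vq[EX_FLOW_TABLE_SIZE];

/* TX path: remember which queue pair this flow transmitted on. */
static void ex_record_txq(u32 flow_hash, u16 txq)
{
	ex_flow_to_vq[flow_hash % EX_FLOW_TABLE_SIZE] = txq;
}

/* RX path: put the packet on the same queue pair, so the same
 * vhost thread that handled the flow's TX also handles its RX
 * (and hence the ACKs). */
static u16 ex_select_rxq(u32 flow_hash, u16 num_queue_pairs)
{
	return ex_flow_to_vq[flow_hash % EX_FLOW_TABLE_SIZE] %
	       num_queue_pairs;
}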
Thanks,
- KK
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html