Message-ID: <20100908104833.GJ23051@redhat.com>
Date: Wed, 8 Sep 2010 13:48:33 +0300
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Krishna Kumar2 <krkumar2@...ibm.com>
Cc: anthony@...emonkey.ws, davem@...emloft.net, kvm@...r.kernel.org,
netdev@...r.kernel.org, rusty@...tcorp.com.au, rick.jones2@...com
Subject: Re: [RFC PATCH 0/4] Implement multiqueue virtio-net
On Wed, Sep 08, 2010 at 02:53:03PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@...hat.com> wrote on 09/08/2010 01:40:11 PM:
>
> >
> > > _______________________________________________________________________________
> > >                        TCP (#numtxqs=2)
> > > N#   BW1     BW2    (%)       SD1    SD2    (%)       RSD1   RSD2   (%)
> > > _______________________________________________________________________________
> > > 4    26387   40716  (54.30)   20     28     (40.00)   86     85     (-1.16)
> > > 8    24356   41843  (71.79)   88     129    (46.59)   372    362    (-2.68)
> > > 16   23587   40546  (71.89)   375    564    (50.40)   1558   1519   (-2.50)
> > > 32   22927   39490  (72.24)   1617   2171   (34.26)   6694   5722   (-14.52)
> > > 48   23067   39238  (70.10)   3931   5170   (31.51)   15823  13552  (-14.35)
> > > 64   22927   38750  (69.01)   7142   9914   (38.81)   28972  26173  (-9.66)
> > > 96   22568   38520  (70.68)   16258  27844  (71.26)   65944  73031  (10.74)
> >
> > That's a significant hit in TCP SD. Is it caused by the imbalance between
> > number of queues for TX and RX? Since you mention RX is complete,
> > maybe measure with a balanced TX/RX?
>
> Yes, I am not sure why it is so high.
Any errors at higher levels? Are any packets reordered?
> I found the same with #RX=#TX too. As a hack, I tried ixgbe without
> MQ (set "indices=1" before calling alloc_etherdev_mq, not sure if
> that is entirely correct) - here too SD worsened by around 40%. I
> can't explain it, since the virtio-net driver runs lock free once
> sch_direct_xmit gets HARD_TX_LOCK for the specific txq. Maybe the SD
> calculation is not strictly correct, since more threads are now
> running in parallel and the load is higher? Eg, if you compare SD
> between #netperfs = 8 vs 16 for the original code (cut-n-paste
> relevant columns only) ...
>
> N# BW SD
> 8 24356 88
> 16 23587 375
>
> ... SD has increased more than 4 times for the same BW.
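For reference, netperf reports service demand as CPU cost per unit of
data moved, so more threads at flat aggregate BW can inflate SD even
if per-packet cost is unchanged. A minimal user-space sketch of that
usec-per-KB arithmetic; the names and sample numbers below are
illustrative, not taken from the netperf sources:

#include <stdio.h>

/*
 * Service demand in the usual netperf sense: microseconds of CPU
 * consumed per KB of data transferred.
 */
static double service_demand(double cpu_util_pct, int num_cpus,
			     double elapsed_sec, double bytes)
{
	double cpu_usec = (cpu_util_pct / 100.0) * num_cpus *
			  elapsed_sec * 1e6;

	return cpu_usec / (bytes / 1024.0);	/* usec per KB */
}

int main(void)
{
	double secs = 60.0;
	/* assume 24356 Mbit/s sustained for 60s at 30% util on 4 CPUs */
	double bytes = 24356e6 / 8.0 * secs;

	printf("SD ~ %.2f usec/KB\n", service_demand(30.0, 4, secs, bytes));
	return 0;
}

If doubling the thread count roughly doubles consumed CPU time at
flat BW, SD doubles too - which would explain part, but not all, of
a 4x jump.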
>
> > What happens with a single netperf?
> > host -> guest performance with TCP and small packet speed
> > are also worth measuring.
>
> OK, I will do this and send the results later today.
>
> > At some level, host/guest communication is easy in that we don't really
> > care which queue is used. I would like to give some thought (and
> > testing) to how this is going to work with a real NIC card and packet
> > steering at the backend.
> > Any idea?
>
> I have done a little testing with guest -> remote server, both
> using a bridge and with macvtap (mq is required only for rx).
> I didn't understand what you mean by packet steering though -
> is it whether packets go out of the NIC on different queues?
> If so, I verified that is the case by adding a counter and
> displaying it through the /debug interface on the host.
> dev_queue_xmit on the host handles it by calling dev_pick_tx().
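dev_pick_tx() falls back to hashing flow fields when the driver has no
ndo_select_queue hook. A stand-alone sketch of that style of selection;
the hash mix below is a toy stand-in for the kernel's jhash, and only
the reciprocal scaling into [0, num_tx_queues) mirrors skb_tx_hash():

#include <stdint.h>
#include <stdio.h>

static uint32_t hash_flow(uint32_t saddr, uint32_t daddr, uint32_t ports)
{
	uint32_t h = saddr ^ daddr ^ ports;

	/* toy avalanche step; the kernel uses jhash here */
	h ^= h >> 16;
	h *= 0x45d9f3b;
	h ^= h >> 16;
	return h;
}

static uint16_t pick_txq(uint32_t hash, uint16_t num_tx_queues)
{
	/* scale the hash into [0, num_tx_queues) without a modulo */
	return (uint16_t)(((uint64_t)hash * num_tx_queues) >> 32);
}

int main(void)
{
	/* 10.0.0.1:5001 -> 10.0.0.2:80 on a 4-queue NIC */
	uint32_t h = hash_flow(0x0a000001, 0x0a000002,
			       (5001u << 16) | 80);

	printf("flow -> txq %u of 4\n", pick_txq(h, 4));
	return 0;
}

Distinct flows land on distinct queues with high probability, which is
consistent with the per-queue counters described above.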
>
> > > Guest interrupts for a 4 TXQ device after a 5 min test:
> > > # egrep "virtio0|CPU" /proc/interrupts
> > > CPU0 CPU1 CPU2 CPU3
> > > 40: 0 0 0 0 PCI-MSI-edge virtio0-config
> > > 41: 126955 126912 126505 126940 PCI-MSI-edge virtio0-input
> > > 42: 108583 107787 107853 107716 PCI-MSI-edge virtio0-output.0
> > > 43: 300278 297653 299378 300554 PCI-MSI-edge virtio0-output.1
> > > 44: 372607 374884 371092 372011 PCI-MSI-edge virtio0-output.2
> > > 45: 162042 162261 163623 162923 PCI-MSI-edge virtio0-output.3
> >
> > Does this mean each interrupt is constantly bouncing between CPUs?
>
> Yes. I didn't do *any* tuning for the tests. The only "tuning" was
> to use a 64K IO size with netperf. When I ran netperf with the
> default (16K), I got a somewhat smaller improvement in BW and
> worse(!) SD than with 64K.
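If the bouncing turns out to matter, a quick test is to pin each MSI-X
vector to one CPU via /proc/irq/<n>/smp_affinity. A minimal sketch, to
be run as root; IRQ 41 is virtio0-input from the listing above and the
value written is a hex CPU bitmask - adjust both for the actual system:

#include <stdio.h>

int main(void)
{
	/* pin IRQ 41 (virtio0-input in the listing above) to CPU0 */
	FILE *f = fopen("/proc/irq/41/smp_affinity", "w");

	if (!f) {
		perror("fopen /proc/irq/41/smp_affinity");
		return 1;
	}
	fprintf(f, "1\n");	/* hex bitmask: bit 0 => CPU0 */
	fclose(f);
	return 0;
}

The same can of course be done with a shell echo; note that irqbalance,
if running, may rewrite the mask.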
>
> Thanks,
>
> - KK