Date:	Wed, 10 Nov 2010 18:16:00 +0200
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Krishna Kumar2 <krkumar2@...ibm.com>
Cc:	anthony@...emonkey.ws, arnd@...db.de, avi@...hat.com,
	davem@...emloft.net, eric.dumazet@...il.com, kvm@...r.kernel.org,
	netdev@...r.kernel.org, rusty@...tcorp.com.au
Subject: Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net

On Tue, Nov 09, 2010 at 10:54:57PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@...hat.com> wrote on 11/09/2010 09:03:25 PM:
> 
> > > > Something strange here, right?
> > > > 1. You are consistently getting >10G/s here, and even with a
> > > >    single stream?
> > >
> > > Sorry, I should have mentioned this here, though I had stated it
> > > in my earlier mails. Each test result has two iterations, each of
> > > 60 seconds, except when #netperfs is 1, for which I do 10
> > > iterations (the result is the sum across the 10 iterations).
> >
> > So we need to divide the number by 10?
> 
> Yes, that is what I get with 512/1K macvtap I/O size :)
> 
> > >  I started doing many more iterations
> > > for 1 netperf after finding the issue earlier with single stream.
> > > So the BW is only 4.5-7 Gbps.
> > >
> > > > 2. With 2 streams is where we got < 10G/s originally. Instead of
> > > >    doubling that, we get a marginal improvement with 2 queues and
> > > >    about 30% worse with 1 queue.
> > >
> > > (Doubling happens consistently for guest -> host, but never for
> > > remote host.) I tried 512/txqs=2 and 1024/txqs=8 to get varied
> > > testing scenarios. In the first case, there is a slight improvement
> > > in BW and a good reduction in SD. In the second case, only SD
> > > improves (though BW drops for 2 streams for some reason). In both
> > > cases, BW and SD improve as the number of sessions increases.
> >
> > I guess this is another indication that something's wrong.
> 
> The patch - both virtio-net and vhost-net - doesn't have any
> locking/mutexes or any other synchronization. The guest -> host
> performance improvement of up to 100% shows the patch is not
> doing anything wrong.

My concern is this: we don't seem to do anything in tap or macvtap to
help packets from separate virtio queues get to separate queues in the
hardware device and to avoid reordering when we do this.

- the skb_tx_hash calculation will get different results on the host
- the hash computation that e.g. TCP does runs in the guest and seems
  to be discarded

etc
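
For reference, here is roughly what the host-side selection looks like
today - a simplified user-space sketch of the skb_tx_hash() logic, not
the literal kernel code (mix() and the seed below stand in for
jhash_1word() and the per-boot hashrnd):

#include <stdint.h>
#include <stdio.h>

static uint32_t hashrnd = 0x12345678;	/* stands in for the per-boot seed */

/* toy mix step standing in for jhash_1word() */
static uint32_t mix(uint32_t v, uint32_t seed)
{
	v ^= seed;
	v *= 0x9e3779b1;
	return v ^ (v >> 16);
}

/* with no recorded RX queue, the TX queue comes from a host-local
 * hash of sk->sk_hash (or the protocol), so whatever flow hash the
 * guest computed has no influence on the result */
static uint16_t pick_tx_queue(uint32_t sk_hash, uint16_t num_tx_queues)
{
	uint32_t h = mix(sk_hash, hashrnd);

	/* scale the hash into [0, num_tx_queues), as the kernel does */
	return (uint16_t)(((uint64_t)h * num_tx_queues) >> 32);
}

int main(void)
{
	/* two flows the guest mapped to different virtio queues can
	 * land on the same host TX queue, and vice versa */
	printf("flow A -> txq %u\n", pick_tx_queue(0xdeadbeef, 4));
	printf("flow B -> txq %u\n", pick_tx_queue(0xcafef00d, 4));
	return 0;
}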

Maybe it's as simple as some tap/macvtap ioctls to set up the queue number
in skbs. Or maybe we need to pass the skb hash from guest to host.
It's this last option that should make us especially cautious, as it'll
affect the guest/host interface.
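
If we did go the hash-passing way, it would presumably look something
like the layout below - purely hypothetical, for discussion only, not
anything in the spec today - with the hash valid only when a new
feature bit has been negotiated:

#include <stdint.h>

#define VIRTIO_NET_F_GUEST_HASH	50	/* made-up feature bit number */

struct virtio_net_hdr_hash {
	/* existing virtio_net_hdr fields */
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	/* new: flow hash computed by the guest, valid only when
	 * VIRTIO_NET_F_GUEST_HASH was negotiated */
	uint32_t hash;
};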

Also see commit d5a9e24afb4ab38110ebb777588ea0bd0eacbd0a: if we have
hardware which records an RX queue, it appears important to
pass that info to the guest and to use it in selecting the TX queue.
Of course we won't see this in netperf runs, but it needs to
be given thought too - supporting it seems to suggest either
sticking the hash in the virtio net header for both tx and rx,
or using multiple RX queues.
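
Extending the sketch above, the d5a9e24 behaviour is roughly this
(paraphrased, not the literal kernel code): when the RX path recorded
a queue on the skb, fold that into the TX queue range instead of
hashing, so a flow sticks to its queue pair:

static uint16_t pick_tx_queue_recorded(int rx_queue_recorded,
				       uint16_t rx_queue,
				       uint16_t num_tx_queues)
{
	if (rx_queue_recorded) {
		uint16_t q = rx_queue;

		/* fold the recorded RX queue into the TX queue range */
		while (q >= num_tx_queues)
			q -= num_tx_queues;
		return q;
	}

	/* otherwise fall back to the flow-hash selection above */
	return pick_tx_queue(0 /* no sk_hash */, num_tx_queues);
}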

> > We are quite far from line rate; the fact that BW does not scale
> > means there's some contention in the code.
> 
> Attaining line speed with macvtap seems to be a generic issue,
> unrelated to my patch specifically. IMHO if review finds nothing
> wrong in the code and it is accepted, everyone will benefit, as
> others can also help find what needs to be implemented in
> vhost/macvtap/qemu to get line speed for guest->remote-host.

No problem, I will queue these patches in some branch
to help enable cooperation, as well as to help you
iterate with incremental patches instead of resending everything each time.


> PS: bare-metal performance for host->remote-host is also
>     2.7 Gbps and 2.8 Gbps for 512/1024 I/O sizes on the same card.
> 
> Thanks,

You mean native Linux BW does not scale on your host with the
# of connections either? I guess this just means we need another
setup for testing?

> - KK