Message-ID: <OFFE63B7F4.D491BE12-ON652577D6.005B4090-652577D6.005F565A@in.ibm.com>
Date:	Tue, 9 Nov 2010 22:54:57 +0530
From:	Krishna Kumar2 <krkumar2@...ibm.com>
To:	"Michael S. Tsirkin" <mst@...hat.com>
Cc:	anthony@...emonkey.ws, arnd@...db.de, avi@...hat.com,
	davem@...emloft.net, eric.dumazet@...il.com, kvm@...r.kernel.org,
	netdev@...r.kernel.org, rusty@...tcorp.com.au
Subject: Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net

"Michael S. Tsirkin" <mst@...hat.com> wrote on 11/09/2010 09:03:25 PM:

> > > Something strange here, right?
> > > 1. You are consistently getting >10G/s here, and even with a single
> > >    stream?
> >
> > Sorry, I should have mentioned this, though I had stated it in my
> > earlier mails. Each test result has two iterations of 60 seconds
> > each, except when #netperfs is 1, for which I do 10 iterations
> > (summed across the 10 iterations).
>
> So need to divide the number by 10?

Yes, that is what I get with 512/1K macvtap I/O size :)
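To make that reporting convention concrete, here is a minimal sketch
(the summed values are assumed examples, not the actual measurements):

```python
# For the single-netperf case, the published number is the SUM of
# bandwidth across 10 iterations, so the per-run figure is that sum
# divided by 10. The reported sums below are assumed examples only.
ITERATIONS = 10

reported_sums_gbps = [45.0, 70.0]          # assumed summed results
per_run_gbps = [s / ITERATIONS for s in reported_sums_gbps]

print(per_run_gbps)   # -> [4.5, 7.0] Gbps per run
```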

> >  I started doing many more iterations
> > for 1 netperf after finding the issue earlier with single stream.
> > So the BW is only 4.5-7 Gbps.
> >
> > > 2. With 2 streams, is where we get < 10G/s originally. Instead of
> > >    doubling that we get a marginal improvement with 2 queues and
> > >    about 30% worse with 1 queue.
> >
> > (Doubling happens consistently for guest -> host, but never for
> > remote host.) I tried 512/txqs=2 and 1024/txqs=8 to get varied
> > testing scenarios. In the first case, there is a slight improvement
> > in BW and a good reduction in SD. In the second case, only SD
> > improves (though BW drops for 2 streams for some reason). In both
> > cases, BW and SD improve as the number of sessions increases.
>
> I guess this is another indication that something's wrong.

The patch - both the virtio-net and vhost-net parts - doesn't add
any locking, mutexes, or other synchronization. The guest -> host
performance improvement of up to 100% suggests the patch is not
doing anything wrong.

> We are quite far from line rate, the fact BW does not scale
> means there's some contention in the code.

Attaining line speed with macvtap seems to be a generic issue,
unrelated to my patch specifically. IMHO, if review finds nothing
wrong in the code and it is accepted, everyone benefits, as others
can then also help find what needs to be implemented in
vhost/macvtap/qemu to reach line speed for guest -> remote-host.

PS: bare-metal performance for host -> remote-host is also only
    2.7 Gbps and 2.8 Gbps for 512/1024 byte I/O sizes on the same card.

Thanks,

- KK

