Message-ID: <20171112154002.kzk7vz7mh4ws3odg@Wei-Dev>
Date:   Sun, 12 Nov 2017 23:40:03 +0800
From:   Wei Xu <wexu@...hat.com>
To:     Matthew Rosato <mjrosato@...ux.vnet.ibm.com>
Cc:     Jason Wang <jasowang@...hat.com>, mst@...hat.com,
        netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: Regression in throughput between kvm guests over virtual bridge

On Tue, Nov 07, 2017 at 08:02:48PM -0500, Matthew Rosato wrote:
> On 11/04/2017 07:35 PM, Wei Xu wrote:
> > On Fri, Nov 03, 2017 at 12:30:12AM -0400, Matthew Rosato wrote:
> >> On 10/31/2017 03:07 AM, Wei Xu wrote:
> >>> On Thu, Oct 26, 2017 at 01:53:12PM -0400, Matthew Rosato wrote:
> >>>>
> >>>>>
> >>>>> Are you using the same binding as mentioned in your previous mail? The hang
> >>>>> might be caused by CPU contention between pktgen and vhost; could you please
> >>>>> try running pktgen from another idle CPU by adjusting the binding?
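> >>>>>
> >>>>> In case it helps, something along these lines should move the pktgen worker
> >>>>> off the busy CPU (a rough sketch via the /proc/net/pktgen interface; the
> >>>>> device name and thread number are only examples):
> >>>>>
> >>>>>   modprobe pktgen
> >>>>>   # detach the device from the default worker on CPU 0 and attach it
> >>>>>   # to the pktgen kernel thread pinned on an idle CPU (here CPU 3)
> >>>>>   echo "rem_device_all"  > /proc/net/pktgen/kpktgend_0
> >>>>>   echo "add_device eth0" > /proc/net/pktgen/kpktgend_3
> >>>>>   echo "start"           > /proc/net/pktgen/pgctrl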
> >>>>
> >>>> I don't think that's the case -- I can cause pktgen to hang in the guest
> >>>> without any cpu binding, and with vhost disabled even.
> >>>
> >>> Yes, I ran a test and it also hangs in the guest. Before we figure that out,
> >>> maybe you could try UDP with uperf for these cases:
> >>>
> >>> VM   -> Host
> >>> Host -> VM
> >>> VM   -> VM
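> >>>
> >>> In case it helps, a uperf profile along these lines should work for the UDP
> >>> runs (an untested sketch; the write size, duration and the $h remote-host
> >>> variable are only examples):
> >>>
> >>>   # udp-stream.xml -- $h is substituted by uperf from the environment
> >>>   <?xml version="1.0"?>
> >>>   <profile name="udp-stream">
> >>>     <group nthreads="1">
> >>>       <transaction iterations="1">
> >>>         <flowop type="connect" options="remotehost=$h protocol=udp"/>
> >>>       </transaction>
> >>>       <transaction duration="60s">
> >>>         <flowop type="write" options="size=8k"/>
> >>>       </transaction>
> >>>       <transaction iterations="1">
> >>>         <flowop type="disconnect"/>
> >>>       </transaction>
> >>>     </group>
> >>>   </profile>
> >>>
> >>>   uperf -s &                                # on the receiving side
> >>>   h=<receiver-ip> uperf -m udp-stream.xml   # on the sending side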
> >>>
> >>
> >> Here are averaged run numbers (Gbps throughput) across 4.12, 4.13 and
> >> net-next with and without Jason's recent "vhost_net: conditionally
> >> enable tx polling" applied (referred to as 'patch' below).  1 uperf
> >> instance in each case:
> > 
> > Thanks a lot for the test. 
> > 
> >>
> >> uperf TCP:
> >> 	 4.12	4.13	4.13+patch	net-next	net-next+patch
> >> ----------------------------------------------------------------------
> >> VM->VM	 35.2	16.5	20.84		22.2		24.36
> > 
> > Are you using the same server/test suite? You mentioned the number was around 
> > 28Gb for 4.12 and that it dropped about 40% for 4.13; it seems something has
> > changed. Are there any performance-tuning options on the server to maximize
> > CPU utilization? 
> 
> I experience some volatility as I am running on 1 of multiple LPARs
> available on this system (they share physical resources).  But I
> think the real issue was that I left my guest environment set to 4
> vcpus while binding as if there were only 1 vcpu (I was working on
> something else and forgot to change back).  This likely tainted my most
> recent results, sorry.

Not a problem at all, also thanks for the feedback. :)

> 
> > 
> > I had a similar experience on x86 servers and desktops before; it made the
> > result numbers fluctuate quite a bit.
> > 
> >> VM->Host 42.15	43.57	44.90		30.83		32.26
> >> Host->VM 53.17	41.51	42.18		37.05		37.30
> > 
> > This is a bit odd; I remember you said there was no regression while 
> > testing Host->VM, didn't you? 
> > 
> >>
> >> uperf UDP:
> >> 	 4.12	4.13	4.13+patch	net-next	net-next+patch
> >> ----------------------------------------------------------------------
> >> VM->VM	 24.93	21.63	25.09		8.86		9.62
> >> VM->Host 40.21	38.21	39.72		8.74		9.35
> >> Host->VM 31.26	30.18	31.25		7.2		9.26
> > 
> > This case should be quite similar to pktgen: if you got an improvement with
> > pktgen, it is usually the same for UDP. Could you please try disabling
> > tso, gso, gro and ufo on all host tap devices and guest virtio-net devices
> > (see the sketch after the test matrix below)? Currently the most significant
> > tests would be these AFAICT:
> > 
> > Host->VM     4.12    4.13
> >  TCP:
> >  UDP:
> > pktgen:
> > 
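> > Disabling the offloads is just a few ethtool invocations per device, e.g.
> > (the tap/interface names below are only examples):
> > 
> >   # host side: every tap device backing a guest NIC
> >   for dev in tap0 tap1; do
> >       ethtool -K "$dev" tso off gso off gro off ufo off
> >   done
> >   # guest side: the virtio-net device
> >   ethtool -K eth0 tso off gso off gro off ufo off
> > 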
> > I don't want to bother you too much, so maybe 4.12 & 4.13 without Jason's patch
> > should be enough, since we have seen positive numbers for those; you can also
> > temporarily skip net-next.
> 
> Here are the requested numbers, averaged over numerous runs -- the guest is
> 4GB+1vcpu, host uperf/pktgen is bound to 1 host CPU, and the qemu and vhost
> threads are pinned to other unique host CPUs (see the sketch after the
> table).  tso, gso, gro, ufo disabled on host taps / guest virtio-net devs
> as requested:
> 
> Host->VM	4.12		4.13
> TCP:		9.92Gb/s	6.44Gb/s
> UDP:		5.77Gb/s	6.63Gb/s
> pktgen:		1572403pps	1904265pps
> 
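> The pinning amounts to something like this with taskset (a rough sketch;
> the CPU numbers and the way the qemu vcpu / vhost thread IDs are found are
> only examples):
> 
>   taskset -cp 2 "$(pidof uperf)"  # uperf (or pktgen) on host CPU 2
>   taskset -cp 4 "$VCPU_TID"       # qemu vcpu thread, from /proc/<qemu-pid>/task
>   taskset -cp 6 "$(pgrep vhost)"  # vhost worker kernel thread
> 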
> UDP/pktgen both show improvement from 4.12->4.13.  More interesting,
> however, is that I am seeing the TCP regression for the first time from
> host->VM.  I wonder if the combination of CPU binding + disabling of one
> or more of tso/gso/gro/ufo is related.

Interesting; then maybe we can address the regression based on this case first,
if we can reproduce it. Can you have a look at the difference in TCP statistics
on both the host and guest sides with 'netstat -s' between tests? 
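
Something like this on both the host and the guest, before and after each
run, should be enough to diff (a minimal sketch):

  netstat -s > /tmp/netstat.before
  # ... run one uperf test ...
  netstat -s > /tmp/netstat.after
  diff /tmp/netstat.before /tmp/netstat.after   # retransmits, drops, etc.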

Wei

> 
> > 
> > If you see that UDP and pktgen are aligned, then it might be helpful to continue
> > with the other two cases; otherwise we are already failing at the first step.
> 
> I will start gathering those numbers tomorrow.
> 
> > 
> >> The net is that Jason's recent patch definitely improves things across
> >> the board at 4.13 as well as at net-next -- but the VM<->VM TCP numbers
> >> I am observing are still lower than base 4.12.
> > 
> > Cool.
> > 
> >>
> >> A separate concern is why my UDP numbers look so bad on net-next (have
> >> not bisected this yet).
> > 
> > This might be another issue. I am on vacation; I will try it on x86 once I am
> > back at work next Wednesday.
> > 
> > Wei
> > 
> >>
> > 
> 
