Message-Id: <1611b26f-0997-3b22-95f5-debf57b7be8c@linux.vnet.ibm.com>
Date: Tue, 7 Nov 2017 20:02:48 -0500
From: Matthew Rosato <mjrosato@...ux.vnet.ibm.com>
To: Wei Xu <wexu@...hat.com>
Cc: Jason Wang <jasowang@...hat.com>, mst@...hat.com,
netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: Regression in throughput between kvm guests over virtual bridge
On 11/04/2017 07:35 PM, Wei Xu wrote:
> On Fri, Nov 03, 2017 at 12:30:12AM -0400, Matthew Rosato wrote:
>> On 10/31/2017 03:07 AM, Wei Xu wrote:
>>> On Thu, Oct 26, 2017 at 01:53:12PM -0400, Matthew Rosato wrote:
>>>>
>>>>>
>>>>> Are you using the same binding as mentioned in your previous mail? It
>>>>> might be caused by cpu contention between pktgen and vhost; could you
>>>>> please try running pktgen from another idle cpu by adjusting the binding?
>>>>
>>>> I don't think that's the case -- I can cause pktgen to hang in the guest
>>>> without any cpu binding, and even with vhost disabled.
>>>
>>> Yes, I ran a test and it also hangs in the guest. Before we figure that
>>> out, maybe you could try udp with uperf for these cases?
>>>
>>> VM -> Host
>>> Host -> VM
>>> VM -> VM
>>>
>>
>> Here are the averaged run numbers (Gbps throughput) across 4.12, 4.13,
>> and net-next, with and without Jason's recent "vhost_net: conditionally
>> enable tx polling" patch applied (referred to as 'patch' below). 1 uperf
>> instance in each case:
>
> Thanks a lot for the test.
>
>>
>> uperf TCP:
>>             4.12    4.13    4.13+patch   net-next   net-next+patch
>> ------------------------------------------------------------------
>> VM->VM      35.2    16.5    20.84        22.2       24.36
>
> Are you using the same server/test suite? You mentioned the number was
> around 28Gb for 4.12 and that it dropped about 40% for 4.13; it seems
> things have changed. Are there any performance tuning options on the
> server to maximize cpu utilization?
I experience some volatility because I am running on one of multiple
LPARs available on this system (they share physical resources). But I
think the real issue was that I left my guest environment set to 4
vcpus while binding as if there were 1 vcpu (I was working on something
else and forgot to change it back). This likely tainted my most recent
results, sorry.
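
For reference, the binding should have looked something like this for a
4-vcpu guest (a rough sketch; the thread-name match and the host CPU
numbers are just examples, not my exact setup):

  # qemu names its vcpu threads like "CPU 0/KVM"; pin each one to its
  # own otherwise-idle host CPU (4-7 in this example)
  cpu=4
  for tid in $(ps -eLo tid,comm | awk '/CPU.*KVM/ {print $1}'); do
      taskset -pc $cpu $tid
      cpu=$((cpu + 1))
  done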
>
> I had a similar experience on x86 servers and desktops before, and it
> made the result numbers fluctuate quite a bit.
>
>> VM->Host    42.15   43.57   44.90        30.83      32.26
>> Host->VM    53.17   41.51   42.18        37.05      37.30
>
> This is a bit odd; I remember you said there was no regression while
> testing Host->VM, didn't you?
>
>>
>> uperf UDP:
>>             4.12    4.13    4.13+patch   net-next   net-next+patch
>> ------------------------------------------------------------------
>> VM->VM      24.93   21.63   25.09        8.86       9.62
>> VM->Host    40.21   38.21   39.72        8.74       9.35
>> Host->VM    31.26   30.18   31.25        7.2        9.26
>
> This case should be quite similar to pktgen; if you get an improvement
> with pktgen, it is usually the same for UDP. Could you please try
> disabling tso, gso, gro, and ufo on all host tap devices and guest
> virtio-net devices? Currently the most significant tests would be these,
> AFAICT:
>
> Host->VM 4.12 4.13
> TCP:
> UDP:
> pktgen:
>
> I don't want to bother you too much, so maybe 4.12 & 4.13 without Jason's
> patch would be enough, since we have already seen positive numbers for
> those; you can also skip net-next for now.
Here are the requested numbers, averaged over numerous runs -- the guest
is 4GB+1vcpu, host uperf/pktgen is bound to 1 host CPU, and the qemu and
vhost threads are pinned to other unique host CPUs. tso, gso, gro, and
ufo are disabled on the host taps / guest virtio-net devices as requested:
Host->VM    4.12          4.13
TCP:        9.92 Gb/s     6.44 Gb/s
UDP:        5.77 Gb/s     6.63 Gb/s
pktgen:     1572403 pps   1904265 pps
UDP/pktgen both show improvement from 4.12->4.13. More interesting,
however, is that I am seeing the TCP regression from host->VM for the
first time. I wonder if the combination of CPU binding + disabling one
or more of tso/gso/gro/ufo is related.
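
For clarity, the offloads were turned off with something like the
following (a sketch; the device names tap0/eth0 are examples):

  # on the host, for each tap device backing a guest nic:
  ethtool -K tap0 tso off gso off gro off ufo off
  # inside each guest, on the virtio-net device:
  ethtool -K eth0 tso off gso off gro off ufo off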
>
> If you see that UDP and pktgen are aligned, then it would be helpful to
> continue with the other two cases; otherwise we are failing at the first
> step.
I will start gathering those numbers tomorrow.
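In case it helps reproduce: the UDP runs use a uperf profile along these
lines (a sketch; the peer address, message size, and duration are
examples, not my exact profile):

  # receiver side:
  uperf -s
  # sender side:
  uperf -m udp_stream.xml

where udp_stream.xml is roughly:

  <?xml version="1.0"?>
  <profile name="UDP_STREAM">
    <group nthreads="1">
      <transaction iterations="1">
        <flowop type="connect" options="remotehost=10.0.0.2 protocol=udp"/>
      </transaction>
      <transaction duration="60s">
        <flowop type="write" options="size=8k"/>
      </transaction>
      <transaction iterations="1">
        <flowop type="disconnect"/>
      </transaction>
    </group>
  </profile>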
>
>> The net is that Jason's recent patch definitely improves things across
>> the board at 4.13 as well as at net-next -- but the VM<->VM TCP numbers
>> I am observing are still lower than base 4.12.
>
> Cool.
>
>>
>> A separate concern is why my UDP numbers look so bad on net-next (have
>> not bisected this yet).
>
> This might be another issue. I am on vacation; I will try it on x86 once
> I am back at work next Wednesday.
>
> Wei
>
>>
>