Message-ID: <15abafa1-6d58-cd85-668a-bf361a296f52@redhat.com>
Date: Fri, 15 Sep 2017 16:55:40 +0800
From: Jason Wang <jasowang@...hat.com>
To: Matthew Rosato <mjrosato@...ux.vnet.ibm.com>,
netdev@...r.kernel.org
Cc: davem@...emloft.net, mst@...hat.com
Subject: Re: Regression in throughput between kvm guests over virtual bridge
On 09/15/2017 11:36, Matthew Rosato wrote:
>> Is the issue gone if you reduce VHOST_RX_BATCH to 1? It would also be
>> helpful to collect a perf diff to see if anything interesting shows up.
>> (Since 4.4 shows the more obvious regression, please use 4.4.)
>>
> The issue still exists when I force VHOST_RX_BATCH = 1.
Interesting, so this looks more like an issue with the changes in
vhost_net rather than with the batch dequeuing itself. I tried this on
Intel but still can't reproduce it.
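For reference, the batch dequeuing in question is roughly the below (a
simplified sketch of the 4.13 drivers/vhost/net.c change, written from
memory, so field names may be slightly off):

    #define VHOST_RX_BATCH 64

    struct vhost_net_buf {
            struct sk_buff *queue[VHOST_RX_BATCH];
            int tail;
            int head;
    };

    /* Refill the local cache with up to VHOST_RX_BATCH skbs pulled
     * from the underlying skb_array in a single batched call.
     */
    static int vhost_net_buf_produce(struct vhost_net_virtqueue *nvq)
    {
            struct vhost_net_buf *rxq = &nvq->rxq;

            rxq->head = 0;
            rxq->tail = skb_array_consume_batched(nvq->rx_array,
                                                  rxq->queue,
                                                  VHOST_RX_BATCH);
            return rxq->tail;
    }

With VHOST_RX_BATCH forced to 1 each refill consumes a single skb, so if
the regression survives that, the batching itself should be innocent.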
>
> Collected perf data, with 4.12 as the baseline, 4.13 as delta1 and
> 4.13+VHOST_RX_BATCH=1 as delta2. All guests running 4.4. Same scenario,
> 2 uperf client guests, 2 uperf slave guests - I collected perf data
> against 1 uperf client process and 1 uperf slave process. Here are the
> significant diffs:
>
> uperf client:
>
> 75.09% +9.32% +8.52% [kernel.kallsyms] [k] enabled_wait
> 9.04% -4.11% -3.79% [kernel.kallsyms] [k] __copy_from_user
> 2.30% -0.79% -0.71% [kernel.kallsyms] [k] arch_free_page
> 2.17% -0.65% -0.58% [kernel.kallsyms] [k] arch_alloc_page
> 0.69% -0.25% -0.24% [kernel.kallsyms] [k] get_page_from_freelist
> 0.56% +0.08% +0.14% [kernel.kallsyms] [k] virtio_ccw_kvm_notify
> 0.42% -0.11% -0.09% [kernel.kallsyms] [k] tcp_sendmsg
> 0.31% -0.15% -0.14% [kernel.kallsyms] [k] tcp_write_xmit
>
> uperf slave:
>
> 72.44% +8.99% +8.85% [kernel.kallsyms] [k] enabled_wait
> 8.99% -3.67% -3.51% [kernel.kallsyms] [k] __copy_to_user
> 2.31% -0.71% -0.67% [kernel.kallsyms] [k] arch_free_page
> 2.16% -0.67% -0.63% [kernel.kallsyms] [k] arch_alloc_page
> 0.89% -0.14% -0.11% [kernel.kallsyms] [k] virtio_ccw_kvm_notify
> 0.71% -0.30% -0.30% [kernel.kallsyms] [k] get_page_from_freelist
> 0.70% -0.25% -0.29% [kernel.kallsyms] [k] __wake_up_sync_key
> 0.61% -0.22% -0.22% [kernel.kallsyms] [k] virtqueue_add_inbuf
It looks like vhost is slowed down for some reason, which leads to more
idle time on 4.13+VHOST_RX_BATCH=1. It would be appreciated if you could
collect a perf diff on the host, one for rx and one for tx.
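Something like the below should work (assuming the vhost worker kthreads
show up as vhost-<qemu-pid>; adjust the tid and duration as needed):

    # identify the vhost worker threads on the host
    pgrep -a vhost

    # record the same thread under each kernel, then compare;
    # repeat for the thread serving the other guest
    perf record -t <vhost-tid> -o perf.data.4.12 -- sleep 30
    perf record -t <vhost-tid> -o perf.data.4.13 -- sleep 30
    perf diff perf.data.4.12 perf.data.4.13

Note that perf diff also accepts more than two data files (the first is
treated as the baseline), which I assume is how the baseline/delta1/
delta2 columns above were produced.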
>
>
>> It may be worth trying to disable zerocopy, or to test from host to
>> guest instead of guest to guest, to exclude a possible issue on the
>> sender side.
>>
> With zerocopy disabled, I'm still seeing the regression. The perf
> numbers provided above were collected with zerocopy enabled.
>
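(For the record, zerocopy can also be ruled out at module load time via
the vhost_net parameter, e.g.:

    modprobe -r vhost_net
    modprobe vhost_net experimental_zcopytx=0

The parameter is read-only at runtime, so it has to be set on load.)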
> I replaced 1 uperf guest and instead ran that uperf client as a host
> process, pointing at a guest. All traffic still over the virtual
> bridge. In this setup, it's still easy to see the regression for the
> remaining guest1<->guest2 uperf run, but the host<->guest3 run does NOT
> exhibit a reliable regression pattern. The significant perf diffs from
> the host uperf process (baseline=4.12, delta=4.13):
>
>
> 59.96% +5.03% [kernel.kallsyms] [k] enabled_wait
> 6.47% -2.27% [kernel.kallsyms] [k] raw_copy_to_user
> 5.52% -1.63% [kernel.kallsyms] [k] raw_copy_from_user
> 0.87% -0.30% [kernel.kallsyms] [k] get_page_from_freelist
> 0.69% +0.30% [kernel.kallsyms] [k] finish_task_switch
> 0.66% -0.15% [kernel.kallsyms] [k] swake_up
> 0.58% -0.00% [vhost] [k] vhost_get_vq_desc
> ...
> 0.42% +0.50% [kernel.kallsyms] [k] ckc_irq_pending
Another hint that we should profile the vhost threads.
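A quick sanity check along the same lines is to compare the vhost
threads' CPU usage under both kernels, e.g.:

    ps -eo pid,pcpu,comm | grep vhost

If the vhost threads behave differently between 4.12 and 4.13 while the
guests sit in enabled_wait, that again points at the host side.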
>
> I also tried flipping the uperf stream around (a guest uperf client is
> communicating to a slave uperf process on the host) and also cannot see
> the regression pattern. So it seems to require a guest on both ends of
> the connection.
>
Yes. Will try to get an s390 environment.
Thanks