Message-Id: <78678f33-c9ba-bf85-7778-b2d0676b78dd@linux.vnet.ibm.com>
Date: Mon, 25 Sep 2017 16:18:13 -0400
From: Matthew Rosato <mjrosato@...ux.vnet.ibm.com>
To: Jason Wang <jasowang@...hat.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, mst@...hat.com
Subject: Re: Regression in throughput between kvm guests over virtual bridge
On 09/22/2017 12:03 AM, Jason Wang wrote:
>
>
> On 09/21/2017 03:38, Matthew Rosato wrote:
>>> Seems to make some progress on wakeup mitigation. The previous patch
>>> tries to reduce unnecessary traversal of the waitqueue during rx. The
>>> attached patch goes even further and disables rx polling while
>>> processing tx. Please try it to see if it makes any difference.
>> Unfortunately, this patch doesn't seem to have made a difference. I
>> tried runs with both this patch and the previous patch applied, as well
>> as only this patch applied for comparison (numbers from the vhost
>> thread of the sending VM):
>>
>> 4.12 4.13 patch1 patch2 patch1+2
>> 2.00% +3.69% +2.55% +2.81% +2.69% [...] __wake_up_sync_key
>>
>> In each case, the regression in throughput was still present.
>
> This probably means some other sources of wakeups were missed. Could
> you please record the callers of __wake_up_sync_key()?
>
Hi Jason,

With your two previous patches applied, every call to __wake_up_sync_key
(for both the sender and server vhost threads) shows the following stack
trace:
vhost-11478-11520 [002] .... 312.927229: __wake_up_sync_key
<-sock_def_readable
vhost-11478-11520 [002] .... 312.927230: <stack trace>
=> dev_hard_start_xmit
=> sch_direct_xmit
=> __dev_queue_xmit
=> br_dev_queue_push_xmit
=> br_forward_finish
=> __br_forward
=> br_handle_frame_finish
=> br_handle_frame
=> __netif_receive_skb_core
=> netif_receive_skb_internal
=> tun_get_user
=> tun_sendmsg
=> handle_tx
=> vhost_worker
=> kthread
=> kernel_thread_starter
=> kernel_thread_starter
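
(In case it helps with reproducing this: stacks like the above can be
captured with the ftrace function tracer plus the func_stack_trace
option, roughly the commands below. This is only a sketch, assuming
debugfs is mounted at /sys/kernel/debug, not necessarily the exact
procedure used here.)

  cd /sys/kernel/debug/tracing
  # trace only the function we care about
  echo __wake_up_sync_key > set_ftrace_filter
  # record the call stack each time the traced function is hit
  echo 1 > options/func_stack_trace
  echo function > current_tracer
  # read the live trace (Ctrl-C to stop)
  cat trace_pipe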
>>
>>> And two questions:
>>> - Does the issue still exist if you run uperf between 2 VMs (instead of 4 VMs)?
>> Verified that the second set of guests is not actually required; I can
>> see the regression with only 2 VMs.
>>
>>> - Can enabling batching in the tap of the sending VM improve performance
>>>   (ethtool -C $tap rx-frames 64)?
>> I tried this, but it did not help (it actually seemed to make things a
>> little worse).
>>
>
> I still can't see a reason that could lead to more wakeups; I will take
> more time to look at this issue and keep you posted.
>
> Thanks
>
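
(Aside, for anyone reproducing the batching experiment above: the knob
Jason mentioned can be set and then verified along these lines, where
$tap is assumed to be the sending VM's tap device name:)

  # enable rx batching of up to 64 frames on the tap device
  ethtool -C $tap rx-frames 64
  # show the current coalescing settings to confirm the change
  ethtool -c $tap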