Message-ID: <f2a98f3a-a5c5-b762-8ec3-119a7708795d@redhat.com>
Date: Tue, 11 Dec 2018 11:06:43 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Tonghao Zhang <xiangxia.m.yue@...il.com>
Subject: Re: [PATCH net 2/4] vhost_net: rework on the lock ordering for busy
polling
On 2018/12/11 9:34 AM, Michael S. Tsirkin wrote:
> On Mon, Dec 10, 2018 at 05:44:52PM +0800, Jason Wang wrote:
>> When we tried to do rx busy polling in the tx path in commit 441abde4cd84
>> ("net: vhost: add rx busy polling in tx path"), we locked the rx vq mutex
>> after the tx vq mutex was held. This may lead to deadlock, so we tried to
>> lock the vqs one by one in commit 78139c94dc8c ("net: vhost: lock the vqs
>> one by one"). With that commit, we avoid the deadlock under the assumption
>> that handle_rx() and handle_tx() run in the same process. But that
>> commit removes the protection for IOTLB updating, which requires the
>> mutex of each vq to be held.
>>
>> To solve this issue, the first step is to have exactly the same lock
>> ordering for vhost_net. This is done through:
>>
>> - For handle_rx(), if busy polling is enabled, lock tx vq immediately.
>> - For handle_tx(), always lock rx vq before tx vq, and unlock it if
>> busy polling is not enabled.
>> - Remove the tricky locking code in busy polling.
>>
>> With this, we have exactly the same lock ordering for vhost_net, which
>> allows us to safely revert commit 78139c94dc8c ("net: vhost: lock the
>> vqs one by one") in the next patch.
>>
>> The patch adds two more atomic operations on the tx path during
>> each round of handle_tx(). A 1-byte TCP_RR test shows no noticeable
>> overhead.
>>
>> Fixes: commit 78139c94dc8c ("net: vhost: lock the vqs one by one")
>> Cc: Tonghao Zhang<xiangxia.m.yue@...il.com>
>> Signed-off-by: Jason Wang<jasowang@...hat.com>
>> ---
>> drivers/vhost/net.c | 18 +++++++++++++++---
>> 1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index ab11b2bee273..5f272ab4d5b4 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -513,7 +513,6 @@ static void vhost_net_busy_poll(struct vhost_net *net,
>>  	struct socket *sock;
>>  	struct vhost_virtqueue *vq = poll_rx ? tvq : rvq;
>>
>> -	mutex_lock_nested(&vq->mutex, poll_rx ? VHOST_NET_VQ_TX: VHOST_NET_VQ_RX);
>>  	vhost_disable_notify(&net->dev, vq);
>>  	sock = rvq->private_data;
>>
>> @@ -543,8 +542,6 @@ static void vhost_net_busy_poll(struct vhost_net *net,
>>  		vhost_net_busy_poll_try_queue(net, vq);
>>  	else if (!poll_rx) /* On tx here, sock has no rx data. */
>>  		vhost_enable_notify(&net->dev, rvq);
>> -
>> -	mutex_unlock(&vq->mutex);
>>  }
>>
>> static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>> @@ -913,10 +910,16 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>>  static void handle_tx(struct vhost_net *net)
>>  {
>>  	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
>> +	struct vhost_net_virtqueue *nvq_rx = &net->vqs[VHOST_NET_VQ_RX];
>>  	struct vhost_virtqueue *vq = &nvq->vq;
>> +	struct vhost_virtqueue *vq_rx = &nvq_rx->vq;
>>  	struct socket *sock;
>>
>> +	mutex_lock_nested(&vq_rx->mutex, VHOST_NET_VQ_RX);
>>  	mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_TX);
>> +	if (!vq->busyloop_timeout)
>> +		mutex_unlock(&vq_rx->mutex);
>> +
>>  	sock = vq->private_data;
>>  	if (!sock)
>>  		goto out;
>> @@ -933,6 +936,8 @@ static void handle_tx(struct vhost_net *net)
>>  		handle_tx_copy(net, sock);
>>
>>  out:
>> +	if (vq->busyloop_timeout)
>> +		mutex_unlock(&vq_rx->mutex);
>>  	mutex_unlock(&vq->mutex);
>>  }
>>
> So the rx mutex is taken on the tx path now. And the tx mutex is on the
> rx path ... This is just messed up. Why can't tx polling drop the rx lock
> before getting the tx lock and vice versa?
Because we want to poll both the tx and rx virtqueues at the same time
(vhost_net_busy_poll()):
	while (vhost_can_busy_poll(endtime)) {
		if (vhost_has_work(&net->dev)) {
			*busyloop_intr = true;
			break;
		}

		if ((sock_has_rx_data(sock) &&
		     !vhost_vq_avail_empty(&net->dev, rvq)) ||
		    !vhost_vq_avail_empty(&net->dev, tvq))
			break;

		cpu_relax();
	}
And we disable kicks and notifications for better performance.
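The ordering rule from the patch description can be sketched in user space
roughly as follows. This is only a model under stated assumptions, not the
vhost_net code: model_vq, model_net, model_net_init(), model_handle_tx() and
model_handle_rx() are made-up names, pthread mutexes stand in for the kernel
mutexes, and the rx side is inferred from the description ("lock tx vq
immediately") since that hunk is not quoted in this message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one virtqueue; not the kernel structure. */
struct model_vq {
	pthread_mutex_t mutex;
	unsigned int busyloop_timeout;	/* 0 means busy polling is disabled */
	int handled;			/* counts rounds, for illustration only */
};

/* Hypothetical stand-in for the device with one rx and one tx queue. */
struct model_net {
	struct model_vq rx;
	struct model_vq tx;
};

static void model_net_init(struct model_net *net, unsigned int timeout)
{
	assert(net != NULL);
	pthread_mutex_init(&net->rx.mutex, NULL);
	pthread_mutex_init(&net->tx.mutex, NULL);
	net->rx.busyloop_timeout = timeout;
	net->tx.busyloop_timeout = timeout;
	net->rx.handled = 0;
	net->tx.handled = 0;
}

/* tx path: always take rx before tx, drop rx early when not busy polling. */
static void model_handle_tx(struct model_net *net)
{
	bool polling;

	assert(net != NULL);
	pthread_mutex_lock(&net->rx.mutex);
	pthread_mutex_lock(&net->tx.mutex);
	polling = net->tx.busyloop_timeout != 0;
	if (!polling)
		pthread_mutex_unlock(&net->rx.mutex);

	net->tx.handled++;	/* the real work would happen here */

	if (polling)
		pthread_mutex_unlock(&net->rx.mutex);
	pthread_mutex_unlock(&net->tx.mutex);
}

/* rx path: take rx first, and take tx right away only when busy polling. */
static void model_handle_rx(struct model_net *net)
{
	bool polling;

	assert(net != NULL);
	pthread_mutex_lock(&net->rx.mutex);
	polling = net->rx.busyloop_timeout != 0;
	if (polling)
		pthread_mutex_lock(&net->tx.mutex);

	net->rx.handled++;	/* the real work would happen here */

	if (polling)
		pthread_mutex_unlock(&net->tx.mutex);
	pthread_mutex_unlock(&net->rx.mutex);
}
```

Both paths acquire the rx mutex before the tx mutex, so in this model neither
path can hold one lock while waiting for the other in the opposite order.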
>
> Or if we really wanted to force everything to be locked at
> all times, let's just use a single mutex.
>
>
>
We could, but it might require more changes, which could be done for
-next, I believe.
Thanks