netdev - Re: [PATCH net 2/4] vhost_net: rework on the lock ordering for busy polling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f2a98f3a-a5c5-b762-8ec3-119a7708795d@redhat.com>
Date:   Tue, 11 Dec 2018 11:06:43 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Tonghao Zhang <xiangxia.m.yue@...il.com>
Subject: Re: [PATCH net 2/4] vhost_net: rework on the lock ordering for busy
 polling


On 2018/12/11 上午9:34, Michael S. Tsirkin wrote:
> On Mon, Dec 10, 2018 at 05:44:52PM +0800, Jason Wang wrote:
>> When we try to do rx busy polling in tx path in commit 441abde4cd84
>> ("net: vhost: add rx busy polling in tx path"), we lock rx vq mutex
>> after tx vq mutex is held. This may lead deadlock so we try to lock vq
>> one by one in commit 78139c94dc8c ("net: vhost: lock the vqs one by
>> one"). With this commit, we avoid the deadlock with the assumption
>> that handle_rx() and handle_tx() run in a same process. But this
>> commit remove the protection for IOTLB updating which requires the
>> mutex of each vq to be held.
>>
>> To solve this issue, the first step is to have a exact same lock
>> ordering for vhost_net. This is done through:
>>
>> - For handle_rx(), if busy polling is enabled, lock tx vq immediately.
>> - For handle_tx(), always lock rx vq before tx vq, and unlock it if
>>    busy polling is not enabled.
>> - Remove the tricky locking codes in busy polling.
>>
>> With this, we can have a exact same lock ordering for vhost_net, this
>> allows us to safely revert commit 78139c94dc8c ("net: vhost: lock the
>> vqs one by one") in next patch.
>>
>> The patch will add two more atomic operations on the tx path during
>> each round of handle_tx(). 1 byte TCP_RR does not notice such
>> overhead.
>>
>> Fixes: commit 78139c94dc8c ("net: vhost: lock the vqs one by one")
>> Cc: Tonghao Zhang<xiangxia.m.yue@...il.com>
>> Signed-off-by: Jason Wang<jasowang@...hat.com>
>> ---
>>   drivers/vhost/net.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index ab11b2bee273..5f272ab4d5b4 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -513,7 +513,6 @@ static void vhost_net_busy_poll(struct vhost_net *net,
>>   	struct socket *sock;
>>   	struct vhost_virtqueue *vq = poll_rx ? tvq : rvq;
>>   
>> -	mutex_lock_nested(&vq->mutex, poll_rx ? VHOST_NET_VQ_TX: VHOST_NET_VQ_RX);
>>   	vhost_disable_notify(&net->dev, vq);
>>   	sock = rvq->private_data;
>>   
>> @@ -543,8 +542,6 @@ static void vhost_net_busy_poll(struct vhost_net *net,
>>   		vhost_net_busy_poll_try_queue(net, vq);
>>   	else if (!poll_rx) /* On tx here, sock has no rx data. */
>>   		vhost_enable_notify(&net->dev, rvq);
>> -
>> -	mutex_unlock(&vq->mutex);
>>   }
>>   
>>   static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>> @@ -913,10 +910,16 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>>   static void handle_tx(struct vhost_net *net)
>>   {
>>   	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
>> +	struct vhost_net_virtqueue *nvq_rx = &net->vqs[VHOST_NET_VQ_RX];
>>   	struct vhost_virtqueue *vq = &nvq->vq;
>> +	struct vhost_virtqueue *vq_rx = &nvq_rx->vq;
>>   	struct socket *sock;
>>   
>> +	mutex_lock_nested(&vq_rx->mutex, VHOST_NET_VQ_RX);
>>   	mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_TX);
>> +	if (!vq->busyloop_timeout)
>> +		mutex_unlock(&vq_rx->mutex);
>> +
>>   	sock = vq->private_data;
>>   	if (!sock)
>>   		goto out;
>> @@ -933,6 +936,8 @@ static void handle_tx(struct vhost_net *net)
>>   		handle_tx_copy(net, sock);
>>   
>>   out:
>> +	if (vq->busyloop_timeout)
>> +		mutex_unlock(&vq_rx->mutex);
>>   	mutex_unlock(&vq->mutex);
>>   }
>>   
> So rx mutex taken on tx path now.  And tx mutex is on rc path ...  This
> is just messed up. Why can't tx polling drop rx lock before
> getting the tx lock and vice versa?


Because we want to poll both tx and rx virtqueue at the same time 
(vhost_net_busy_poll()).

     while (vhost_can_busy_poll(endtime)) {
         if (vhost_has_work(&net->dev)) {
             *busyloop_intr = true;
             break;
         }

         if ((sock_has_rx_data(sock) &&
              !vhost_vq_avail_empty(&net->dev, rvq)) ||
             !vhost_vq_avail_empty(&net->dev, tvq))
             break;

         cpu_relax();

     }


And we disable kicks and notification for better performance.


>
> Or if we really wanted to force everything to be locked at
> all times, let's just use a single mutex.
>
>
>

We could, but it might requires more changes which could be done for 
-next I believe.


Thanks