Message-ID: <1f73d1b6-d240-422f-b071-358ed5902747@linux.alibaba.com>
Date: Wed, 17 Jan 2024 11:38:44 +0800
From: Heng Qi <hengqi@...ux.alibaba.com>
To: Simon Horman <horms@...nel.org>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev,
Jason Wang <jasowang@...hat.com>, "Michael S. Tsirkin" <mst@...hat.com>,
Paolo Abeni <pabeni@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH net-next 2/3] virtio-net: batch dim request
On 2024/1/17 at 3:49 AM, Simon Horman wrote:
> On Tue, Jan 16, 2024 at 09:11:32PM +0800, Heng Qi wrote:
>> Currently, each time the driver attempts to update the coalescing
>> parameters for a vq, it needs to kick the device.
>> The following path is observed:
>> 1. Driver kicks the device;
>> 2. After the device receives the kick, CPU scheduling occurs and
>> multiple buffers are DMAed multiple times;
>> 3. The device completes processing and replies with a response.
>>
>> When large-queue devices issue multiple requests and kick the device
>> frequently, this often interrupts the work of the device-side CPU.
>> In addition, each vq request is processed separately, causing more
>> delays for the CPU to wait for the DMA request to complete.
>>
>> These interruptions and overhead will strain the CPU responsible for
>> controlling the path of the DPU, especially in multi-device and
>> large-queue scenarios.
>>
>> To solve the above problems, we internally tried batching requests,
>> which merges requests from multiple queues and sends them at once.
>> We conservatively tested 8 queue commands and sent them together.
>> The DPU processing efficiency can be improved by 8 times, which
>> greatly eases the DPU's support for multi-device and multi-queue DIM.
>>
>> Suggested-by: Xiaoming Zhao <zxm377917@...baba-inc.com>
>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
> ...
>
>> @@ -3546,16 +3552,32 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>> update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
>> if (update_moder.usec != rq->intr_coal.max_usecs ||
>> update_moder.pkts != rq->intr_coal.max_packets) {
>> - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
>> - update_moder.usec,
>> - update_moder.pkts);
>> - if (err)
>> - pr_debug("%s: Failed to send dim parameters on rxq%d\n",
>> - dev->name, qnum);
>> - dim->state = DIM_START_MEASURE;
>> + coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
>> + coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
>> + coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
>> + rq->intr_coal.max_usecs = update_moder.usec;
>> + rq->intr_coal.max_packets = update_moder.pkts;
>> + j++;
>> }
>> }
>>
>> + if (!j)
>> + goto ret;
>> +
>> + coal->num_entries = cpu_to_le32(j);
>> + sg_init_one(&sgs, coal, sizeof(struct virtnet_batch_coal) +
>> + j * sizeof(struct virtio_net_ctrl_coal_vq));
>> + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
>> + VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET,
>> + &sgs))
>> + dev_warn(&vi->vdev->dev, "Failed to add dim command\n.");
>> +
>> + for (i = 0; i < j; i++) {
>> + rq = &vi->rq[(coal->coal_vqs[i].vqn) / 2];
> Hi Heng Qi,
>
> The type of .vqn is __le16, but here it is used as an
> integer in host byte order. Perhaps this should be (completely untested!):
>
> rq = &vi->rq[le16_to_cpu(coal->coal_vqs[i].vqn) / 2];
Hi Simon,
Thanks for the catch, I will check this out.
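To make the issue concrete, here is a minimal userspace sketch, not the
driver code. All demo_* names are hypothetical stand-ins for the kernel
helpers (cpu_to_le16, le16_to_cpu, rxq2vq), and the big-endian read is
simulated by decoding the two wire bytes in the opposite order:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a __le16 field: the two bytes as laid out in memory. */
typedef struct { uint8_t b[2]; } demo_le16;

/* Stand-in for cpu_to_le16(): store the low byte first. */
static demo_le16 demo_cpu_to_le16(uint16_t v)
{
	demo_le16 w = { { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) } };
	return w;
}

/* Stand-in for le16_to_cpu(): decode low byte first, on any host. */
static uint16_t demo_le16_to_cpu(demo_le16 w)
{
	return (uint16_t)(w.b[0] | (w.b[1] << 8));
}

/* What a big-endian host gets if it reads the field without converting. */
static uint16_t demo_raw_read_big_endian(demo_le16 w)
{
	return (uint16_t)((w.b[0] << 8) | w.b[1]);
}

/* Assumed mapping: rx queue i uses vq 2*i, matching the "/ 2" in the patch. */
static uint16_t demo_rxq2vq(uint16_t rxq)
{
	return (uint16_t)(rxq * 2);
}

/* Recover the rx queue index from a host-order vq number. */
static uint16_t demo_vq2rxq(uint16_t vq)
{
	assert(vq % 2 == 0);	/* rx vqs are the even-numbered ones under this mapping */
	return (uint16_t)(vq / 2);
}
```

For rx queue 3, the stored vqn bytes are 06 00. Converting first gives
demo_vq2rxq(demo_le16_to_cpu(vqn)) == 3, while the unconverted read on a
big-endian host is 0x0600, so the index becomes 768 and vi->rq[] would be
accessed out of bounds. On little-endian hosts both reads agree, which is
why this kind of bug tends to go unnoticed in testing.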
>
>> + rq->dim.state = DIM_START_MEASURE;
>> + }
>> +
>> +ret:
>> rtnl_unlock();
>> }
>>