Message-ID: <20240116194912.GE588419@kernel.org>
Date: Tue, 16 Jan 2024 19:49:12 +0000
From: Simon Horman <horms@...nel.org>
To: Heng Qi <hengqi@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev,
Jason Wang <jasowang@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Paolo Abeni <pabeni@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH net-next 2/3] virtio-net: batch dim request
On Tue, Jan 16, 2024 at 09:11:32PM +0800, Heng Qi wrote:
> Currently, each time the driver attempts to update the coalescing
> parameters for a vq, it needs to kick the device.
> The following path is observed:
> 1. Driver kicks the device;
> 2. After the device receives the kick, CPU scheduling occurs and
> multiple buffers are DMAed multiple times;
> 3. The device completes processing and replies with a response.
>
> When large-queue devices issue multiple requests and kick the device
> frequently, this often interrupts the work of the device-side CPU.
> In addition, each vq request is processed separately, causing more
> delays for the CPU to wait for the DMA request to complete.
>
> These interruptions and overhead will strain the CPU responsible for
> controlling the path of the DPU, especially in multi-device and
> large-queue scenarios.
>
> To solve the above problems, we internally tried batching requests,
> which merges requests from multiple queues and sends them at once.
> We conservatively tested 8 queue commands and sent them together.
> The DPU processing efficiency can be improved by 8 times, which
> greatly eases the DPU's support for multi-device and multi-queue DIM.
>
> Suggested-by: Xiaoming Zhao <zxm377917@...baba-inc.com>
> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
...
> @@ -3546,16 +3552,32 @@ static void virtnet_rx_dim_work(struct work_struct *work)
> update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> if (update_moder.usec != rq->intr_coal.max_usecs ||
> update_moder.pkts != rq->intr_coal.max_packets) {
> - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
> - update_moder.usec,
> - update_moder.pkts);
> - if (err)
> - pr_debug("%s: Failed to send dim parameters on rxq%d\n",
> - dev->name, qnum);
> - dim->state = DIM_START_MEASURE;
> + coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> + coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> + coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> + rq->intr_coal.max_usecs = update_moder.usec;
> + rq->intr_coal.max_packets = update_moder.pkts;
> + j++;
> }
> }
>
> + if (!j)
> + goto ret;
> +
> + coal->num_entries = cpu_to_le32(j);
> + sg_init_one(&sgs, coal, sizeof(struct virtnet_batch_coal) +
> + j * sizeof(struct virtio_net_ctrl_coal_vq));
> + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> + VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET,
> + &sgs))
> + dev_warn(&vi->vdev->dev, "Failed to add dim command\n.");
> +
> + for (i = 0; i < j; i++) {
> + rq = &vi->rq[(coal->coal_vqs[i].vqn) / 2];

Hi Heng Qi,

The type of .vqn is __le16, but here it is used as an
integer in host byte order. Perhaps this should be (completely untested!):

	rq = &vi->rq[le16_to_cpu(coal->coal_vqs[i].vqn) / 2];

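To illustrate the concern, here is a minimal userspace sketch (also untested against the driver) of what happens when a little-endian 16-bit value is used as an index without conversion. The names wire_le16, to_le16, from_le16, misread_as_be and vq2rxq_demo are hypothetical stand-ins for this example only; they are not kernel APIs, and the kernel's own cpu_to_le16()/le16_to_cpu() are what the driver would use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __le16: two bytes in wire order. */
typedef struct { uint8_t b[2]; } wire_le16;

/* Userspace analogue of cpu_to_le16(): store the low byte first. */
static wire_le16 to_le16(uint16_t v)
{
	wire_le16 w = { { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) } };
	return w;
}

/* Userspace analogue of le16_to_cpu(): correct on any host byte order. */
static uint16_t from_le16(wire_le16 w)
{
	return (uint16_t)(w.b[0] | (w.b[1] << 8));
}

/* What a big-endian host would see if it read the bytes unconverted. */
static uint16_t misread_as_be(wire_le16 w)
{
	return (uint16_t)((w.b[0] << 8) | w.b[1]);
}

/* Assumed mapping for illustration: rx queue index is vqn / 2. */
static unsigned int vq2rxq_demo(uint16_t vqn)
{
	return vqn / 2;
}
```

With vqn = 4 stored via to_le16(), from_le16() gives back 4 and the queue index is 2, whereas misread_as_be() yields 0x0400 and an index of 512, which would be far outside vi->rq[] for most configurations. On a little-endian host the unconverted read happens to give the right answer, so this kind of problem tends to show up only on big-endian machines or under sparse's endianness checking.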
> + rq->dim.state = DIM_START_MEASURE;
> + }
> +
> +ret:
> rtnl_unlock();
> }
>
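
For readers following the batching logic in the hunk above, the collect-then-send pattern can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: the mock_* structs and the helpers batch_len() and collect_changed() are hypothetical, the field layout is modelled on the names visible in the hunk, and values are kept in host byte order for brevity (the driver converts with cpu_to_le16()/cpu_to_le32()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace mock-ups; the layout is an assumption, not the uapi header. */
struct mock_coal { uint32_t max_packets; uint32_t max_usecs; };
struct mock_coal_vq { uint16_t vqn; uint16_t reserved; struct mock_coal coal; };
struct mock_batch_coal {
	uint32_t num_entries;
	struct mock_coal_vq coal_vqs[];	/* one entry per changed queue */
};

/* Bytes needed for a header plus n entries (same shape as the sg_init_one() length). */
static size_t batch_len(uint32_t n)
{
	return sizeof(struct mock_batch_coal) + (size_t)n * sizeof(struct mock_coal_vq);
}

/*
 * Append one entry for each queue whose wanted parameters differ from the
 * current ones, and return the number of entries j. The caller must have
 * allocated room for up to nq entries.
 */
static uint32_t collect_changed(struct mock_batch_coal *b,
				const struct mock_coal *cur,
				const struct mock_coal *want, uint32_t nq)
{
	uint32_t i, j = 0;

	for (i = 0; i < nq; i++) {
		if (cur[i].max_usecs == want[i].max_usecs &&
		    cur[i].max_packets == want[i].max_packets)
			continue;	/* unchanged: nothing to send for this queue */
		b->coal_vqs[j].vqn = (uint16_t)(i * 2);	/* assumed rxq -> vq mapping */
		b->coal_vqs[j].reserved = 0;
		b->coal_vqs[j].coal = want[i];
		j++;
	}
	b->num_entries = j;
	return j;
}
```

A caller would allocate batch_len(nq) bytes once, run collect_changed(), skip the send entirely when it returns 0, and otherwise submit batch_len(j) bytes as a single command instead of j separate ones.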