Message-ID: <CAPpAL=xGQvpKwe8WcfHX8e59EdpZbOiohRS1qgeR4axFBDQ_+w@mail.gmail.com>
Date: Mon, 2 Sep 2024 18:03:41 +0800
From: Lei Yang <leiyang@...hat.com>
To: Dragos Tatulea <dtatulea@...dia.com>
Cc: Eugenio Perez Martin <eperezma@...hat.com>, Jason Wang <jasowang@...hat.com>,
Michael Tsirkin <mst@...hat.com>, Si-Wei Liu <si-wei.liu@...cle.com>,
virtualization@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
Gal Pressman <gal@...dia.com>, Leon Romanovsky <leon@...nel.org>, kvm@...r.kernel.org,
Parav Pandit <parav@...dia.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [PATCH vhost v2 00/10] vdpa/mlx5: Parallelize device suspend/resume
Hi Dragos,
QE tested this series with a Mellanox NIC. It failed with [1] when
booting the guest, and the host dmesg also printed the messages in [2].
The bug can be reproduced by booting a guest with a vhost-vdpa device.
[1] (qemu) qemu-kvm: vhost VQ 1 ring restore failed: -1: Operation not
permitted (1)
qemu-kvm: vhost VQ 0 ring restore failed: -1: Operation not permitted (1)
qemu-kvm: unable to start vhost net: 5: falling back on userspace virtio
qemu-kvm: vhost_set_features failed: Device or resource busy (16)
qemu-kvm: unable to start vhost net: 16: falling back on userspace virtio
[2] Host dmesg:
[ 1406.187977] mlx5_core 0000:0d:00.2:
mlx5_vdpa_compat_reset:3267:(pid 8506): performing device reset
[ 1406.189221] mlx5_core 0000:0d:00.2:
mlx5_vdpa_compat_reset:3267:(pid 8506): performing device reset
[ 1406.190354] mlx5_core 0000:0d:00.2:
mlx5_vdpa_show_mr_leaks:573:(pid 8506) warning: mkey still alive after
resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2
[ 1471.538487] mlx5_core 0000:0d:00.2: cb_timeout_handler:938:(pid
428): cmd[13]: MODIFY_GENERAL_OBJECT(0xa01) Async, timeout. Will cause
a leak of a command resource
[ 1471.539486] mlx5_core 0000:0d:00.2: cb_timeout_handler:938:(pid
428): cmd[12]: MODIFY_GENERAL_OBJECT(0xa01) Async, timeout. Will cause
a leak of a command resource
[ 1471.540351] mlx5_core 0000:0d:00.2: modify_virtqueues:1617:(pid
8511) error: modify vq 0 failed, state: 0 -> 0, err: 0
[ 1471.541433] mlx5_core 0000:0d:00.2: modify_virtqueues:1617:(pid
8511) error: modify vq 1 failed, state: 0 -> 0, err: -110
[ 1471.542388] mlx5_core 0000:0d:00.2: mlx5_vdpa_set_status:3203:(pid
8511) warning: failed to resume VQs
[ 1471.549778] mlx5_core 0000:0d:00.2:
mlx5_vdpa_show_mr_leaks:573:(pid 8511) warning: mkey still alive after
resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2
[ 1512.929854] mlx5_core 0000:0d:00.2:
mlx5_vdpa_compat_reset:3267:(pid 8565): performing device reset
[ 1513.100290] mlx5_core 0000:0d:00.2:
mlx5_vdpa_show_mr_leaks:573:(pid 8565) warning: mkey still alive after
resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2
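
For readers decoding the numbers in the logs above: the negative `err:` values in the driver messages are Linux errno codes. The snippet below is a minimal sketch, assuming the standard Linux errno numbering; the `LINUX_ERRNO` table and the `decode_errs` helper are illustrative names and are not part of QEMU or the mlx5 driver.

```python
import re

# Linux errno values that appear in the logs above (assumed: standard
# asm-generic numbering). Illustrative table, not taken from any tool.
LINUX_ERRNO = {
    1: "EPERM (Operation not permitted)",
    16: "EBUSY (Device or resource busy)",
    110: "ETIMEDOUT (Connection timed out)",
}


def decode_errs(log: str) -> list[str]:
    """Describe every non-zero 'err: <n>' field found in a dmesg excerpt."""
    out = []
    for match in re.finditer(r"err: (-?\d+)", log):
        code = abs(int(match.group(1)))
        if code:  # err: 0 means no error code was reported for that VQ
            out.append(LINUX_ERRNO.get(code, f"unknown errno {code}"))
    return out


sample = (
    "modify_virtqueues:1617:(pid 8511) error: modify vq 0 failed, "
    "state: 0 -> 0, err: 0\n"
    "modify_virtqueues:1617:(pid 8511) error: modify vq 1 failed, "
    "state: 0 -> 0, err: -110\n"
)
print(decode_errs(sample))
```

Read this way, the -110 reported for vq 1 is a timeout, which is consistent with the cb_timeout_handler messages logged just before it.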
Thanks
Lei
> This series parallelizes the mlx5_vdpa device suspend and resume
> operations through the firmware async API. The purpose is to reduce live
> migration downtime.
>
> The series starts with changing the VQ suspend and resume commands
> to the async API. After that, the switch is made to issue multiple
> commands of the same type in parallel.
>
> Then, an additional improvement is added: keep the notifiers enabled
> during suspend but make it a NOP. Upon resume make sure that the link
> state is forwarded. This shaves around 30ms per device constant time.
>
> Finally, use parallel VQ suspend and resume during the CVQ MQ command.
>
> For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> x 2 threads per core), the improvements are:
>
> +-------------------+--------+--------+-----------+
> | operation         | Before | After  | Reduction |
> |-------------------+--------+--------+-----------|
> | mlx5_vdpa_suspend | 37 ms  | 2.5 ms | 14x       |
> | mlx5_vdpa_resume  | 16 ms  | 5 ms   | 3x        |
> +-------------------+--------+--------+-----------+
>
> ---
> v2:
> - Changed to parallel VQ suspend/resume during CVQ MQ command.
> Support added in the last 2 patches.
> - Made the fw async command more generic and moved it to resources.c.
> Did that because the following series (parallel mkey ops) needs this
> code as well.
> Dropped Acked-by from Eugenio on modified patches.
> - Fixed kfree -> kvfree.
> - Removed extra newline caught during review.
> - As discussed in the v1, the series can be pulled in completely in
> the vhost tree [0]. The mlx5_core patch was reviewed by Tariq who is
> also a maintainer for mlx5_core.
>
> [0] - https://lore.kernel.org/virtualization/6582792d-8db2-4bc0-bf3a-248fe5c8fc56@nvidia.com/T/#maefabb2fde5adfb322d16ca16ae64d540f75b7d2
>
> Dragos Tatulea (10):
> net/mlx5: Support throttled commands from async API
> vdpa/mlx5: Introduce error logging function
> vdpa/mlx5: Introduce async fw command wrapper
> vdpa/mlx5: Use async API for vq query command
> vdpa/mlx5: Use async API for vq modify commands
> vdpa/mlx5: Parallelize device suspend
> vdpa/mlx5: Parallelize device resume
> vdpa/mlx5: Keep notifiers during suspend but ignore
> vdpa/mlx5: Small improvement for change_num_qps()
> vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command
>
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 22 +
> drivers/vdpa/mlx5/core/resources.c | 73 ++++
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 396 +++++++++++-------
> 4 files changed, 361 insertions(+), 151 deletions(-)
>
> --
> 2.45.1
>