Message-ID: <9f0ef4ebd801a35873561384b2aedc920faecd03.camel@nvidia.com>
Date: Thu, 5 Oct 2023 12:16:12 +0000
From: Dragos Tatulea <dtatulea@...dia.com>
To: "eperezma@...hat.com" <eperezma@...hat.com>
CC: "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"xuanzhuo@...ux.alibaba.com" <xuanzhuo@...ux.alibaba.com>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
Gal Pressman <gal@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"jasowang@...hat.com" <jasowang@...hat.com>,
"leon@...nel.org" <leon@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>,
"mst@...hat.com" <mst@...hat.com>
Subject: Re: [PATCH vhost 14/16] vdpa/mlx5: Enable hw support for vq
descriptor mapping
On Thu, 2023-10-05 at 11:42 +0200, Eugenio Perez Martin wrote:
> On Thu, Sep 28, 2023 at 6:50 PM Dragos Tatulea <dtatulea@...dia.com> wrote:
> >
> > Vq descriptor mappings are supported in hardware by filling in an
> > additional mkey which contains the descriptor mappings to the hw vq.
> >
> > A previous patch in this series added support for hw mkey (mr) creation
> > for ASID 1.
> >
> > This patch fills in both the vq data and vq descriptor mkeys based on
> > group ASID mapping.
> >
> > The feature is signaled to the vdpa core through the presence of the
> > .get_vq_desc_group op.
> >
> > Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
> > ---
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 26 ++++++++++++++++++++++++--
> > include/linux/mlx5/mlx5_ifc_vdpa.h | 7 ++++++-
> > 2 files changed, 30 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 25bd2c324f5b..46441e41892c 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -823,6 +823,7 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
> > u32 out[MLX5_ST_SZ_DW(create_virtio_net_q_out)] = {};
> > struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
> > struct mlx5_vdpa_mr *vq_mr;
> > + struct mlx5_vdpa_mr *vq_desc_mr;
> > void *obj_context;
> > u16 mlx_features;
> > void *cmd_hdr;
> > @@ -878,6 +879,11 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
> > vq_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
> > if (vq_mr)
> > MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, vq_mr->mkey);
> > +
> > + vq_desc_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
> > + if (vq_desc_mr)
> > + MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, vq_desc_mr->mkey);
> > +
> > MLX5_SET(virtio_q, vq_ctx, umem_1_id, mvq->umem1.id);
> > MLX5_SET(virtio_q, vq_ctx, umem_1_size, mvq->umem1.size);
> > MLX5_SET(virtio_q, vq_ctx, umem_2_id, mvq->umem2.id);
> > @@ -2265,6 +2271,16 @@ static u32 mlx5_vdpa_get_vq_group(struct vdpa_device *vdev, u16 idx)
> > return MLX5_VDPA_DATAVQ_GROUP;
> > }
> >
> > +static u32 mlx5_vdpa_get_vq_desc_group(struct vdpa_device *vdev, u16 idx)
> > +{
> > + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> > +
> > + if (is_ctrl_vq_idx(mvdev, idx))
> > + return MLX5_VDPA_CVQ_GROUP;
> > +
> > + return MLX5_VDPA_DATAVQ_DESC_GROUP;
> > +}
> > +
> > static u64 mlx_to_vritio_features(u16 dev_features)
> > {
> > u64 result = 0;
> > @@ -3139,7 +3155,7 @@ static int mlx5_set_group_asid(struct vdpa_device *vdev, u32 group,
> > {
> > struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> >
> > - if (group >= MLX5_VDPA_NUMVQ_GROUPS)
> > + if (group >= MLX5_VDPA_NUMVQ_GROUPS || asid >= MLX5_VDPA_NUM_AS)
>
> Nit: the check for asid >= MLX5_VDPA_NUM_AS is redundant, as it will
> already have been checked by the VHOST_VDPA_SET_GROUP_ASID handler in
> drivers/vhost/vdpa.c:vhost_vdpa_vring_ioctl. Not a big deal.
Ack.
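
For reference, the guard in drivers/vhost/vdpa.c:vhost_vdpa_vring_ioctl
looks roughly like this (paraphrased from memory, not a verbatim quote):

	case VHOST_VDPA_SET_GROUP_ASID:
		if (copy_from_user(&s, argp, sizeof(s)))
			return -EFAULT;
		/* asid (s.num) is validated against the number of address
		 * spaces the device registered with, before the driver op
		 * is ever called. */
		if (s.num >= vdpa->nas)
			return -EINVAL;
		if (!ops->set_group_asid)
			return -EOPNOTSUPP;
		return ops->set_group_asid(vdpa, idx, s.num);

Since the device registers with MLX5_VDPA_NUM_AS address spaces, the
driver-side check indeed can't trigger.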
>
> > return -EINVAL;
> >
> > mvdev->group2asid[group] = asid;
> > @@ -3160,6 +3176,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
> > .get_vq_irq = mlx5_get_vq_irq,
> > .get_vq_align = mlx5_vdpa_get_vq_align,
> > .get_vq_group = mlx5_vdpa_get_vq_group,
> > + .get_vq_desc_group = mlx5_vdpa_get_vq_desc_group, /* Op disabled if not supported. */
> > .get_device_features = mlx5_vdpa_get_device_features,
> > .set_driver_features = mlx5_vdpa_set_driver_features,
> > .get_driver_features = mlx5_vdpa_get_driver_features,
> > @@ -3258,6 +3275,7 @@ struct mlx5_vdpa_mgmtdev {
> > struct vdpa_mgmt_dev mgtdev;
> > struct mlx5_adev *madev;
> > struct mlx5_vdpa_net *ndev;
> > + struct vdpa_config_ops vdpa_ops;
> > };
> >
> > static int config_func_mtu(struct mlx5_core_dev *mdev, u16 mtu)
> > @@ -3371,7 +3389,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
> > max_vqs = 2;
> > }
> >
> > - ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> > + ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mgtdev->vdpa_ops,
> > MLX5_VDPA_NUMVQ_GROUPS, MLX5_VDPA_NUM_AS, name, false);
> > if (IS_ERR(ndev))
> > return PTR_ERR(ndev);
> > @@ -3546,6 +3564,10 @@ static int mlx5v_probe(struct auxiliary_device *adev,
> > MLX5_CAP_DEV_VDPA_EMULATION(mdev, max_num_virtio_queues) + 1;
> > mgtdev->mgtdev.supported_features = get_supported_features(mdev);
> > mgtdev->madev = madev;
> > + mgtdev->vdpa_ops = mlx5_vdpa_ops;
> > +
> > + if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, desc_group_mkey_supported))
> > + mgtdev->vdpa_ops.get_vq_desc_group = NULL;
>
> I think this is better handled by splitting mlx5_vdpa_ops in two: one
> with .get_vq_desc_group and the other without it. You can see an
> example of this in the simulator, where one version supports
> incremental updating with .dma_map and the other supports .set_map.
> Otherwise, this can get messy if more members opt out or opt in.
>
I implemented it this way because the upcoming resumable vq support will also
need to selectively implement .resume, depending on whether the hw capability
is present. Splitting would then require 4 different ops structs to cover all
combinations. The other option would be to force these two ops together
(.get_vq_desc_group and .resume), but I would prefer not to do that.
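
To make the tradeoff concrete, here is a rough sketch of the two approaches
once .resume enters the picture (resume_supported() is a hypothetical helper,
not something from this series):

	/* Splitting: one static struct per capability combination, 2^N. */
	static const struct vdpa_config_ops mlx5_vdpa_ops;             /* neither */
	static const struct vdpa_config_ops mlx5_vdpa_ops_desc;        /* desc group only */
	static const struct vdpa_config_ops mlx5_vdpa_ops_resume;      /* resume only */
	static const struct vdpa_config_ops mlx5_vdpa_ops_desc_resume; /* both */

	/* vs. copy-and-mask as done in this patch: one template, with
	 * unsupported ops cleared at probe time based on device caps. */
	mgtdev->vdpa_ops = mlx5_vdpa_ops;
	if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, desc_group_mkey_supported))
		mgtdev->vdpa_ops.get_vq_desc_group = NULL;
	if (!resume_supported(mdev))
		mgtdev->vdpa_ops.resume = NULL;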
> But I'm ok with this too, so whatever version you choose:
>
> Acked-by: Eugenio Pérez <eperezma@...hat.com>
>
> >
> > err = vdpa_mgmtdev_register(&mgtdev->mgtdev);
> > if (err)
> > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > index 9becdc3fa503..b86d51a855f6 100644
> > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > @@ -74,7 +74,11 @@ struct mlx5_ifc_virtio_q_bits {
> > u8 reserved_at_320[0x8];
> > u8 pd[0x18];
> >
> > - u8 reserved_at_340[0xc0];
> > + u8 reserved_at_340[0x20];
> > +
> > + u8 desc_group_mkey[0x20];
> > +
> > + u8 reserved_at_380[0x80];
> > };
> >
> > struct mlx5_ifc_virtio_net_q_object_bits {
> > @@ -141,6 +145,7 @@ enum {
> > MLX5_VIRTQ_MODIFY_MASK_STATE = (u64)1 << 0,
> > MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS = (u64)1 << 3,
> > MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > + MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY = (u64)1 << 14,
> > };
> >
> > enum {
> > --
> > 2.41.0
> >
>
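
For anyone skimming the thread, the net effect of the patch is the lookup
below (restated from the create_virtqueue() hunk above): each vq group
resolves through group2asid to an mr, and that mr's mkey, if present, is
written into the hw vq context.

	vq_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
	if (vq_mr)
		MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, vq_mr->mkey);

	vq_desc_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
	if (vq_desc_mr)
		MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, vq_desc_mr->mkey);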