Message-ID: <20251120072442.2292818-1-zhipingz@meta.com>
Date: Wed, 19 Nov 2025 23:24:40 -0800
From: Zhiping Zhang <zhipingz@...a.com>
To: Jason Gunthorpe <jgg@...pe.ca>
CC: Leon Romanovsky <leon@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
<linux-rdma@...r.kernel.org>, <linux-pci@...r.kernel.org>,
<netdev@...r.kernel.org>, Keith Busch <kbusch@...nel.org>,
Yochai Cohen <yochai@...dia.com>, Yishai Hadas <yishaih@...dia.com>
Subject: Re: [RFC 2/2] Set steering-tag directly for PCIe P2P memory access
On Monday, November 17, 2025 at 8:00 AM, Jason Gunthorpe wrote:
> Re: [RFC 2/2] Set steering-tag directly for PCIe P2P memory access
>
> On Thu, Nov 13, 2025 at 01:37:12PM -0800, Zhiping Zhang wrote:
> > RDMA: Set steering-tag value directly in DMAH struct for DMABUF MR
> >
> > This patch enables construction of a dma handler (DMAH) with the P2P memory type
> > and a direct steering-tag value. It can be used to register a RDMA memory
> > region with DMABUF for the RDMA NIC to access the other device's memory via P2P.
> >
> > Signed-off-by: Zhiping Zhang <zhipingz@...a.com>
> > ---
> > .../infiniband/core/uverbs_std_types_dmah.c | 28 +++++++++++++++++++
> > drivers/infiniband/core/uverbs_std_types_mr.c | 3 ++
> > drivers/infiniband/hw/mlx5/dmah.c | 5 ++--
> > .../net/ethernet/mellanox/mlx5/core/lib/st.c | 12 +++++---
> > include/linux/mlx5/driver.h | 4 +--
> > include/rdma/ib_verbs.h | 2 ++
> > include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
> > 7 files changed, 46 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/uverbs_std_types_dmah.c b/drivers/infiniband/core/uverbs_std_types_dmah.c
> > index 453ce656c6f2..1ef400f96965 100644
> > --- a/drivers/infiniband/core/uverbs_std_types_dmah.c
> > +++ b/drivers/infiniband/core/uverbs_std_types_dmah.c
> > @@ -61,6 +61,27 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DMAH_ALLOC)(
> > dmah->valid_fields |= BIT(IB_DMAH_MEM_TYPE_EXISTS);
> > }
> >
> > + if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_ALLOC_DMAH_DIRECT_ST_VAL)) {
> > + ret = uverbs_copy_from(&dmah->direct_st_val, attrs,
> > + UVERBS_ATTR_ALLOC_DMAH_DIRECT_ST_VAL);
> > + if (ret)
> > + goto err;
>
> This should not come from userspace, the dmabuf exporter should
> provide any TPH hints as part of the attachment process.
>
> We are trying not to allow userspace raw access to the TPH values, so
> this is not a desirable UAPI here.
>
> Jason

Thanks for your feedback!

I understand the concern about not exposing raw TPH values to userspace.
To clarify, would it be acceptable to use an index-based mapping table,
where userspace provides an index and the kernel translates it to the
appropriate steering-tag value? Since the PCIe spec allows steering tags of
up to 16 bits, a full table would need 64K entries of 2 bytes each, i.e. up
to 128KB. Do you see this as a reasonable approach, or is there a preferred
alternative?
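
To make the question concrete, here is a minimal sketch of the index-based
translation, written as plain C that compiles in userspace. All names
(struct st_map, st_map_set, st_map_lookup) are hypothetical and do not exist
in the kernel today. It assumes the kernel or exporter side populates the
table, and that userspace only ever passes an index.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical sketch: none of these names are existing kernel APIs. */
#define ST_TABLE_ENTRIES (1u << 16)  /* one slot per possible 16-bit index */

struct st_map {
	uint16_t tag[ST_TABLE_ENTRIES];       /* 65536 * 2 bytes = 128 KiB */
	uint8_t  valid[ST_TABLE_ENTRIES / 8]; /* one bit per slot, 8 KiB   */
};

/* The tag array alone accounts for the 128KB figure. */
static_assert(sizeof(((struct st_map *)0)->tag) == 128 * 1024,
	      "tag table should be 128 KiB");

/* Populate a slot. Assumed to be called by the kernel/exporter, never
 * directly with a userspace-chosen tag. */
static int st_map_set(struct st_map *m, uint32_t idx, uint16_t tag)
{
	if (idx >= ST_TABLE_ENTRIES)
		return -EINVAL;
	m->tag[idx] = tag;
	m->valid[idx / 8] |= (uint8_t)(1u << (idx % 8));
	return 0;
}

/* Translate a userspace-supplied index into a steering tag.
 * Out-of-range indices and unpopulated slots are rejected. */
static int st_map_lookup(const struct st_map *m, uint32_t idx, uint16_t *tag)
{
	if (idx >= ST_TABLE_ENTRIES)
		return -EINVAL;
	if (!(m->valid[idx / 8] & (1u << (idx % 8))))
		return -ENOENT;
	*tag = m->tag[idx];
	return 0;
}
```

With something like this, the DMAH alloc path would call the lookup on the
index from userspace and fail the request if the slot is not populated, so
the raw tag value never crosses the UAPI.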

Additionally, in cases where the dmabuf exporter device can handle all
possible 16-bit steering-tag values (i.e., it has its own internal mapping
logic or table), should the values still be entirely abstracted away from
userspace?

Zhiping