Message-ID: <20251120131140.GT17968@ziepe.ca>
Date: Thu, 20 Nov 2025 09:11:40 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Zhiping Zhang <zhipingz@...a.com>
Cc: Leon Romanovsky <leon@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
linux-rdma@...r.kernel.org, linux-pci@...r.kernel.org,
netdev@...r.kernel.org, Keith Busch <kbusch@...nel.org>,
Yochai Cohen <yochai@...dia.com>, Yishai Hadas <yishaih@...dia.com>
Subject: Re: [RFC 2/2] Set steering-tag directly for PCIe P2P memory access
On Wed, Nov 19, 2025 at 11:24:40PM -0800, Zhiping Zhang wrote:
> On Monday, November 17, 2025 at 8:00 AM, Jason Gunthorpe wrote:
> > Re: [RFC 2/2] Set steering-tag directly for PCIe P2P memory access
> >
> > On Thu, Nov 13, 2025 at 01:37:12PM -0800, Zhiping Zhang wrote:
> > > RDMA: Set steering-tag value directly in DMAH struct for DMABUF MR
> > >
> > > This patch enables construction of a dma handler (DMAH) with the P2P memory type
> > > and a direct steering-tag value. It can be used to register a RDMA memory
> > > region with DMABUF for the RDMA NIC to access the other device's memory via P2P.
> > >
> > > Signed-off-by: Zhiping Zhang <zhipingz@...a.com>
> > > ---
> > > .../infiniband/core/uverbs_std_types_dmah.c | 28 +++++++++++++++++++
> > > drivers/infiniband/core/uverbs_std_types_mr.c | 3 ++
> > > drivers/infiniband/hw/mlx5/dmah.c | 5 ++--
> > > .../net/ethernet/mellanox/mlx5/core/lib/st.c | 12 +++++---
> > > include/linux/mlx5/driver.h | 4 +--
> > > include/rdma/ib_verbs.h | 2 ++
> > > include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
> > > 7 files changed, 46 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/uverbs_std_types_dmah.c b/drivers/infiniband/core/uverbs_std_types_dmah.c
> > > index 453ce656c6f2..1ef400f96965 100644
> > > --- a/drivers/infiniband/core/uverbs_std_types_dmah.c
> > > +++ b/drivers/infiniband/core/uverbs_std_types_dmah.c
> > > @@ -61,6 +61,27 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DMAH_ALLOC)(
> > > dmah->valid_fields |= BIT(IB_DMAH_MEM_TYPE_EXISTS);
> > > }
> > >
> > > + if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_ALLOC_DMAH_DIRECT_ST_VAL)) {
> > > + ret = uverbs_copy_from(&dmah->direct_st_val, attrs,
> > > + UVERBS_ATTR_ALLOC_DMAH_DIRECT_ST_VAL);
> > > + if (ret)
> > > + goto err;
> >
> > This should not come from userspace, the dmabuf exporter should
> > provide any TPH hints as part of the attachment process.
> >
> > We are trying not to allow userspace raw access to the TPH values, so
> > this is not a desirable UAPI here.
>
> Thanks for your feedback!
>
> I understand the concern about not exposing raw TPH values to
> userspace. To clarify, would it be acceptable to use an index-based
> mapping table, where userspace provides an index and the kernel
> translates it to the appropriate TPH value? Given that the PCIe spec
> allows up to 16-bit TPH values, this could require a mapping table
> of up to 128KB. Do you see this as a reasonable approach, or is
> there a preferred alternative?
The issue here is to secure the TPH. The kernel driver that owns the
exporting device should control what TPH values an importing driver
will use.
I don't see how an indirection table helps anything. You need to add
an API to DMABUF to retrieve the TPH.
> Additionally, in cases where the dmabuf exporter device can handle all possible 16-bit
> TPH values (i.e., it has its own internal mapping logic or table), should this still be
> entirely abstracted away from userspace?
I imagine the exporting device provides the raw on-the-wire TPH value
it wants the importing device to use, and the importing device is
responsible for programming it using whatever scheme it has.
Jason
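
[Editor's sketch, not part of the original message.] The flow described
above (exporter supplies the raw TPH value, importer programs it, userspace
never sees it) could look roughly like the userspace mock below. Every
name here (mock_dmabuf, mock_dmabuf_ops, get_tph, mock_tph_hint, mock_mr,
mock_importer_attach) is hypothetical and invented for illustration; none
of it is an existing dma-buf, RDMA, or mlx5 interface, and the thread does
not settle what the real API would look like.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: TPH hint an exporter hands to an importer at attach time. */
struct mock_tph_hint {
	uint16_t steering_tag;	/* raw on-the-wire steering tag, up to 16 bits */
	uint8_t ph;		/* 2-bit processing hint */
};

struct mock_dmabuf;

/* Hypothetical exporter callbacks: only the exporting driver picks the tag. */
struct mock_dmabuf_ops {
	int (*get_tph)(const struct mock_dmabuf *buf, struct mock_tph_hint *out);
};

struct mock_dmabuf {
	const struct mock_dmabuf_ops *ops;
	void *priv;		/* exporter-private state */
};

/* Hypothetical importer-side memory region state. */
struct mock_mr {
	bool tph_valid;
	uint16_t st_programmed;
	uint8_t ph_programmed;
};

/* Ask the exporter for its TPH hint; -EOPNOTSUPP if it has none to offer. */
int mock_dmabuf_get_tph(const struct mock_dmabuf *buf, struct mock_tph_hint *out)
{
	if (!buf || !out)
		return -EINVAL;
	if (!buf->ops || !buf->ops->get_tph)
		return -EOPNOTSUPP;
	return buf->ops->get_tph(buf, out);
}

/*
 * Importer attach path: the tag comes from the exporter, never from a
 * userspace-supplied attribute. A missing hint is not an error; the MR
 * is simply set up without TPH.
 */
int mock_importer_attach(struct mock_mr *mr, const struct mock_dmabuf *buf)
{
	struct mock_tph_hint hint;
	int ret;

	if (!mr)
		return -EINVAL;

	mr->tph_valid = false;
	mr->st_programmed = 0;
	mr->ph_programmed = 0;

	ret = mock_dmabuf_get_tph(buf, &hint);
	if (ret == -EOPNOTSUPP)
		return 0;
	if (ret)
		return ret;

	/* A real importer would translate this into its own HW format here. */
	mr->st_programmed = hint.steering_tag;
	mr->ph_programmed = hint.ph & 0x3;
	mr->tph_valid = true;
	return 0;
}

/* Example exporter that always reports one fixed tag from its private data. */
struct mock_exporter_priv {
	uint16_t tag;
};

int mock_exporter_get_tph(const struct mock_dmabuf *buf, struct mock_tph_hint *out)
{
	const struct mock_exporter_priv *p = buf->priv;

	out->steering_tag = p->tag;
	out->ph = 0;
	return 0;
}

const struct mock_dmabuf_ops mock_exporter_ops = {
	.get_tph = mock_exporter_get_tph,
};
```

The point of the shape is that the only path by which a steering tag
reaches the importer's MR state is the exporter callback, so there is no
userspace attribute to validate or map through a table.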