Message-ID: <cover.1752752567.git.leon@kernel.org>
Date: Thu, 17 Jul 2025 15:17:24 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Andrew Lunn <andrew+netdev@...n.ch>,
Bernard Metzler <bmt@...ich.ibm.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Bryan Tan <bryan-bt.tan@...adcom.com>,
Chengchang Tang <tangchengchang@...wei.com>,
Cheng Xu <chengyou@...ux.alibaba.com>,
Christian Benvenuti <benve@...co.com>,
Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>,
Edward Srouji <edwards@...dia.com>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Junxian Huang <huangjunxian6@...ilicon.com>,
Kai Shen <kaishen@...ux.alibaba.com>,
Kalesh AP <kalesh-anakkur.purayil@...adcom.com>,
Konstantin Taranov <kotaranov@...rosoft.com>,
linux-pci@...r.kernel.org,
linux-rdma@...r.kernel.org,
Long Li <longli@...rosoft.com>,
Michael Margolin <mrgolin@...zon.com>,
Michal Kalderon <mkalderon@...vell.com>,
Moshe Shemesh <moshe@...dia.com>,
Mustafa Ismail <mustafa.ismail@...el.com>,
Nelson Escobar <neescoba@...co.com>,
netdev@...r.kernel.org,
Paolo Abeni <pabeni@...hat.com>,
Potnuri Bharat Teja <bharat@...lsio.com>,
Saeed Mahameed <saeedm@...dia.com>,
Selvin Xavier <selvin.xavier@...adcom.com>,
Tariq Toukan <tariqt@...dia.com>,
Tatyana Nikolova <tatyana.e.nikolova@...el.com>,
Vishnu Dasa <vishnu.dasa@...adcom.com>,
Yishai Hadas <yishaih@...dia.com>,
Zhu Yanjun <zyjzyj2000@...il.com>
Subject: [PATCH rdma-next v2 0/8] RDMA support for DMA handle
Changelog:
v2:
* Removed the function pointer existence checks in favour of the uverbs macro.
v1: https://lore.kernel.org/all/cover.1752388126.git.leon@kernel.org
* Added Bjorn's Acked-by on PCI patch.
* Changed title of first PCI patch.
* Changed hns and efa to count not-supported commands.
* Slightly changed protection of mlx5 SF parent_mdev access.
* Moved SIW debug print to be before dmah check.
v0: https://lore.kernel.org/all/cover.1751907231.git.leon@kernel.org
--------------------------------------------------------------------
From Yishai,
This patch series introduces a new DMA Handle (DMAH) object, along with
corresponding APIs for its allocation and deallocation.
The DMAH object encapsulates attributes relevant for DMA transactions.
While initially intended to support TLP Processing Hints (TPH) [1], the
design is extensible to accommodate future features such as PCI
multipath for DMA, PCI UIO configurations, traffic class selection, and
more.
Additionally, we introduce a new ioctl method on the MR object:
UVERBS_METHOD_REG_MR.
This method consolidates multiple reg_mr variants under a single
user-space ioctl interface, supporting: ibv_reg_mr(), ibv_reg_mr_iova(),
ibv_reg_mr_iova2() and ibv_reg_dmabuf_mr(). It also enables passing a
DMA handle as part of the registration process.
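As a rough illustration of the consolidation described above, the sketch
below folds the four user-space entry points into one argument bundle.
Everything in it (struct reg_mr_args, the reg_mr_*() helpers, the field
names) is hypothetical and exists only for this example; it is not the
uAPI attribute layout added by the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument bundle; NOT the uAPI layout from this series. */
struct reg_mr_args {
	uint64_t addr;        /* user VA, or offset into the dma-buf */
	uint64_t length;
	uint64_t iova;
	int access;
	int dmabuf_fd;        /* -1 when this is not a dma-buf registration */
	bool has_dmah;        /* true once a DMA handle has been attached */
	uint32_t dmah_handle; /* only meaningful when has_dmah is true */
};

/* ibv_reg_mr() shape: the IOVA is simply the virtual address. */
struct reg_mr_args reg_mr_plain(void *addr, size_t length, int access)
{
	struct reg_mr_args args = {
		.addr = (uint64_t)(uintptr_t)addr,
		.length = length,
		.iova = (uint64_t)(uintptr_t)addr,
		.access = access,
		.dmabuf_fd = -1,
	};

	return args;
}

/* ibv_reg_mr_iova() and ibv_reg_mr_iova2() shape: caller picks the IOVA. */
struct reg_mr_args reg_mr_with_iova(void *addr, size_t length,
				    uint64_t iova, int access)
{
	struct reg_mr_args args = reg_mr_plain(addr, length, access);

	args.iova = iova;
	return args;
}

/* ibv_reg_dmabuf_mr() shape: an fd plus an offset replaces the VA. */
struct reg_mr_args reg_mr_dmabuf(uint64_t offset, size_t length,
				 uint64_t iova, int fd, int access)
{
	struct reg_mr_args args = {
		.addr = offset,
		.length = length,
		.iova = iova,
		.access = access,
		.dmabuf_fd = fd,
	};

	return args;
}

/* Optional step: attach a DMA handle to any of the variants above. */
void reg_mr_attach_dmah(struct reg_mr_args *args, uint32_t handle)
{
	args->has_dmah = true;
	args->dmah_handle = handle;
}
```

The point of the sketch is only that the variants differ in which
optional inputs are present, so a single method with optional
attributes can cover all of them.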
The series also adds the following DMAH-related functionality to the IB
layer:
- Association with a CPU ID and its memory type, for use with Steering
Tags [2].
- Inclusion of Processing Hints (PH) data for TPH functionality [3].
- Security enforcement: only tasks allowed to run on a given CPU may
request a DMA handle for it.
- Reference counting for DMAH life cycle management and safe usage
across memory regions.
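The CPU permission rule in the list above can be pictured with a
user-space analogue: ask whether the calling thread's affinity mask
contains the requested CPU. This is only a sketch under that assumption.
task_may_target_cpu() is a hypothetical name, and the in-kernel check in
the patches may well be implemented differently.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>

/*
 * Return true if the calling thread is allowed to run on cpu_id.
 * User-space analogue for illustration; not the kernel code from
 * this series.
 */
bool task_may_target_cpu(int cpu_id)
{
	cpu_set_t allowed;

	if (cpu_id < 0 || cpu_id >= CPU_SETSIZE)
		return false;

	CPU_ZERO(&allowed);
	/* A pid of 0 means "the calling thread". */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return false;

	return CPU_ISSET(cpu_id, &allowed);
}
```

A caller would run this check before requesting a handle for cpu_id and
treat a false result as a permission error.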
mlx5 driver implementation:
--------------------------
The series includes implementation of the above functionality in the
mlx5 driver.
In mlx5_core:
- Enables TPH over PCIe when both firmware and OS support it.
- Manages Steering Tags and corresponding indices by writing tag values
to the PCI configuration space.
- Exposes APIs to upper layers (e.g., mlx5_ib) to enable the PCIe TPH
functionality.
In mlx5_ib:
- Adds full support for DMAH operations.
- Utilizes mlx5_core's Steering Tag APIs to derive tag indices from
input.
- Stores the resulting index in a mlx5_dmah structure for use during
MKEY creation with a DMA handle.
- Adds support for allowing MKEYs to be created in conjunction with DMA
handles.
Additional details are provided in the commit messages.
[1] Background, from PCIe specification 6.2.
TLP Processing Hints (TPH)
--------------------------
TLP Processing Hints is an optional feature that provides hints in
Request TLP headers to facilitate optimized processing of Requests that
target Memory Space. These Processing Hints enable the system hardware
(e.g., the Root Complex and/or Endpoints) to optimize platform
resources such as system and memory interconnect on a per TLP basis.
Steering Tags are system-specific values used to identify a processing
resource that a Requester explicitly targets. System software discovers
and identifies TPH capabilities to determine the Steering Tag allocation
for each Function that supports TPH.
[2] Steering Tags
Functions that intend to target a TLP towards a specific processing
resource such as a host processor or system cache hierarchy require
topological information of the target cache (e.g., which host cache).
Steering Tags are system-specific values that provide information about
the host or cache structure in the system cache hierarchy. These values
are used to associate processing elements within the platform with the
processing of Requests.
[3] Processing Hints
The Requester provides hints to the Root Complex or other targets about
the intended use of data and data structures by the host and/or device.
The hints are provided by the Requester, which has knowledge of upcoming
Request patterns, and which the Completer would not be able to deduce
autonomously (with good accuracy).
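For readers unfamiliar with the PH field, the sketch below lists the
four 2-bit encodings defined by the PCIe specification and pairs one
with a Steering Tag. The type and function names (enum tph_ph, struct
tph_hint, tph_hint_init()) are hypothetical and are not taken from the
patches; the 16-bit tag width assumes Extended TPH, otherwise tags are
8 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* PH[1:0] encodings from the PCIe specification. */
enum tph_ph {
	TPH_PH_BIDIRECTIONAL = 0x0, /* frequent access by host and device */
	TPH_PH_REQUESTER     = 0x1, /* frequent access by the device */
	TPH_PH_TARGET        = 0x2, /* frequent access by the host */
	TPH_PH_TARGET_PRIO   = 0x3, /* host access, high temporal locality */
};

/* Hypothetical pairing of a hint with a Steering Tag. */
struct tph_hint {
	uint8_t ph;            /* one of enum tph_ph */
	uint16_t steering_tag; /* 8 bits, or 16 bits with Extended TPH */
};

/* Validate the inputs and fill *hint; returns 0 on success, -1 on error. */
int tph_hint_init(struct tph_hint *hint, unsigned int ph, unsigned int st)
{
	if (hint == NULL || ph > TPH_PH_TARGET_PRIO || st > 0xffff)
		return -1;

	hint->ph = (uint8_t)ph;
	hint->steering_tag = (uint16_t)st;
	return 0;
}
```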
Yishai
Yishai Hadas (8):
PCI/TPH: Expose pcie_tph_get_st_table_size()
net/mlx5: Expose IFC bits for TPH
net/mlx5: Add support for device steering tag
IB/core: Add UVERBS_METHOD_REG_MR on the MR object
RDMA/core: Introduce a DMAH object and its alloc/free APIs
RDMA/mlx5: Add DMAH object support
IB: Extend UVERBS_METHOD_REG_MR to get DMAH
RDMA/mlx5: Add DMAH support for reg_user_mr/reg_user_dmabuf_mr
drivers/infiniband/core/Makefile | 1 +
drivers/infiniband/core/device.c | 3 +
drivers/infiniband/core/rdma_core.h | 1 +
drivers/infiniband/core/restrack.c | 2 +
drivers/infiniband/core/uverbs_cmd.c | 2 +-
.../infiniband/core/uverbs_std_types_dmah.c | 145 +++++++++++++++
drivers/infiniband/core/uverbs_std_types_mr.c | 172 +++++++++++++++++-
drivers/infiniband/core/uverbs_uapi.c | 1 +
drivers/infiniband/core/verbs.c | 5 +-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 8 +
drivers/infiniband/hw/bnxt_re/ib_verbs.h | 2 +
drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 1 +
drivers/infiniband/hw/cxgb4/mem.c | 6 +-
drivers/infiniband/hw/efa/efa.h | 2 +
drivers/infiniband/hw/efa/efa_verbs.c | 12 ++
drivers/infiniband/hw/erdma/erdma_verbs.c | 6 +-
drivers/infiniband/hw/erdma/erdma_verbs.h | 3 +-
drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
drivers/infiniband/hw/hns/hns_roce_mr.c | 6 +
drivers/infiniband/hw/irdma/verbs.c | 9 +
drivers/infiniband/hw/mana/mana_ib.h | 2 +
drivers/infiniband/hw/mana/mr.c | 8 +
drivers/infiniband/hw/mlx4/mlx4_ib.h | 1 +
drivers/infiniband/hw/mlx4/mr.c | 4 +
drivers/infiniband/hw/mlx5/Makefile | 1 +
drivers/infiniband/hw/mlx5/devx.c | 4 +
drivers/infiniband/hw/mlx5/dmah.c | 54 ++++++
drivers/infiniband/hw/mlx5/dmah.h | 23 +++
drivers/infiniband/hw/mlx5/main.c | 5 +
drivers/infiniband/hw/mlx5/mlx5_ib.h | 7 +
drivers/infiniband/hw/mlx5/mr.c | 106 +++++++++--
drivers/infiniband/hw/mlx5/odp.c | 1 +
drivers/infiniband/hw/mthca/mthca_provider.c | 6 +-
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 +-
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +-
drivers/infiniband/hw/qedr/verbs.c | 6 +-
drivers/infiniband/hw/qedr/verbs.h | 3 +-
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 4 +
drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 1 +
drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 5 +
.../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h | 1 +
drivers/infiniband/sw/rdmavt/mr.c | 5 +
drivers/infiniband/sw/rdmavt/mr.h | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.c | 4 +
drivers/infiniband/sw/siw/siw_verbs.c | 7 +-
drivers/infiniband/sw/siw/siw_verbs.h | 3 +-
.../net/ethernet/mellanox/mlx5/core/Makefile | 5 +
.../net/ethernet/mellanox/mlx5/core/lib/st.c | 164 +++++++++++++++++
.../net/ethernet/mellanox/mlx5/core/main.c | 2 +
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 9 +
drivers/pci/tph.c | 11 +-
include/linux/mlx5/driver.h | 20 ++
include/linux/mlx5/mlx5_ifc.h | 14 +-
include/linux/pci-tph.h | 1 +
include/rdma/ib_verbs.h | 29 +++
include/rdma/restrack.h | 4 +
include/uapi/rdma/ib_user_ioctl_cmds.h | 32 ++++
57 files changed, 909 insertions(+), 41 deletions(-)
create mode 100644 drivers/infiniband/core/uverbs_std_types_dmah.c
create mode 100644 drivers/infiniband/hw/mlx5/dmah.c
create mode 100644 drivers/infiniband/hw/mlx5/dmah.h
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
--
2.50.1