Message-ID: <20251222-frmr_pools-v2-0-f06a99caa538@nvidia.com>
Date: Mon, 22 Dec 2025 14:40:35 +0200
From: Edward Srouji <edwards@...dia.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Leon Romanovsky <leon@...nel.org>, "Saeed
Mahameed" <saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>, Mark Bloch
<mbloch@...dia.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
CC: <linux-kernel@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<netdev@...r.kernel.org>, Michael Guralnik <michaelgur@...dia.com>, "Edward
Srouji" <edwards@...dia.com>, Chiara Meiohas <cmeiohas@...dia.com>, "Yishai
Hadas" <yishaih@...dia.com>, Patrisious Haddad <phaddad@...dia.com>
Subject: [PATCH rdma-next v2 00/11] RDMA/core: Introduce FRMR pools
infrastructure
From Michael:
This patch series introduces a new FRMR (Fast Registration Memory Region)
pool infrastructure for the RDMA core subsystem. The goal is to provide
efficient management and allow reuse of MRs (Memory Regions) for RDMA
device drivers.
Background
==========
Memory registration and deregistration can be a significant bottleneck in
RDMA applications that need to register memory regions dynamically in
their data path or must re-register memory on application restart.
Repeatedly allocating and freeing these resources introduces overhead,
particularly in high-throughput or latency-sensitive environments where
memory regions are frequently cycled. The mlx5_ib driver has already
adopted memory registration reuse mechanisms and has demonstrated
significant performance improvements as a result.
FRMR pools will store handles of the reusable objects, giving drivers
the flexibility to choose what to store (e.g., pointers or indexes).
Device driver integration requires the ability to modify the hardware
objects underlying MRs when reusing FRMR handles, allowing the update
of pre-allocated handles to fit the parameters of requested MR
registrations. The FRMR pools group memory region handles by the
attributes that cannot be changed after allocation, such as access flags,
ATS capabilities, vendor keys, and DMA block size, so each pool is uniquely
characterized by these non-modifiable attributes.
This ensures compatibility and correctness while allowing drivers
flexibility in managing other aspects of the MR lifecycle.
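As a rough illustration of how a pool can be identified by its
non-modifiable attributes, here is a minimal userspace sketch of a key and
a total-order comparison, the kind of function an ordered lookup structure
needs. The struct, its field names and widths, and toy_frmr_key_cmp() are
hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pool key: only attributes that are fixed at allocation. */
struct toy_frmr_key {
	bool ats;		/* ATS capability */
	uint32_t access_flags;	/* access flags requested for the MR */
	uint64_t vendor_key;	/* opaque vendor-specific bits */
	uint64_t dma_blocks;	/* DMA block parameter of the handle */
};

/* Three-way compare of two unsigned values: -1, 0 or 1. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/*
 * Total order over keys. Two keys compare equal only if every immutable
 * attribute matches, so equal keys can share one pool.
 */
static int toy_frmr_key_cmp(const struct toy_frmr_key *a,
			    const struct toy_frmr_key *b)
{
	int ret;

	assert(a && b);

	ret = cmp_u64(a->ats, b->ats);
	if (ret)
		return ret;
	ret = cmp_u64(a->access_flags, b->access_flags);
	if (ret)
		return ret;
	ret = cmp_u64(a->vendor_key, b->vendor_key);
	if (ret)
		return ret;
	return cmp_u64(a->dma_blocks, b->dma_blocks);
}
```

The field order used for the comparison is arbitrary in this sketch; any
consistent order yields a valid key for an ordered tree.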
Solution Overview
=================
This patch series introduces a centralized, per-device FRMR pooling
infrastructure that provides:
1. Pool Organization: Uses an RB-tree to organize pools by FRMR
characteristics (ATS support, access flags, vendor-specific keys,
and DMA block count). This allows efficient lookup and reuse of
compatible FRMR handles.
2. Dynamic Allocation: Pools grow dynamically on demand when no cached
handles are available, ensuring optimal memory usage without
sacrificing performance.
3. Aging Mechanism: Implements an aging system. Unused handles are
gradually freed after a configurable aging period
(default: 60 seconds), preventing memory bloat during idle periods.
4. Pinned Handles: Supports pinning a minimum number of handles per
pool to maintain performance for latency-sensitive workloads, avoiding
allocation overhead on critical paths.
5. Driver Flexibility: Provides a callback-based interface
(ib_frmr_pool_ops) that allows drivers to implement their own FRMR
creation/destruction logic while leveraging the common pooling
infrastructure.
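The interplay between on-demand growth, reuse, aging and pinning listed
above can be sketched with a toy userspace model of a single pool. This is
only an illustration under simplifying assumptions: all toy_* names are
hypothetical, the create/destroy helpers stand in for driver callbacks, and
the aging pass drains everything above the pinned minimum at once instead
of tracking idle time as a real implementation would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_POOL_CAP 64

/* Hypothetical single pool: a LIFO stack of opaque driver handles. */
struct toy_frmr_pool {
	uint32_t handles[TOY_POOL_CAP];
	size_t cached;		/* handles currently available for reuse */
	size_t pinned;		/* minimum number the ager leaves cached */
	uint32_t next_handle;	/* source of fresh handle values */
	unsigned int created;	/* total handles created so far */
	unsigned int destroyed;	/* total handles destroyed so far */
};

/* Stand-in for a driver "create" callback: mint a fresh handle. */
static uint32_t toy_create(struct toy_frmr_pool *pool)
{
	pool->created++;
	return pool->next_handle++;
}

/* Stand-in for a driver "destroy" callback: only counts the call. */
static void toy_destroy(struct toy_frmr_pool *pool, uint32_t handle)
{
	(void)handle;
	pool->destroyed++;
}

/* Reuse a cached handle if there is one, otherwise grow on demand. */
static uint32_t toy_pool_pop(struct toy_frmr_pool *pool)
{
	if (pool->cached)
		return pool->handles[--pool->cached];
	return toy_create(pool);
}

/* Return a handle for later reuse; destroy it if the toy stack is full. */
static void toy_pool_push(struct toy_frmr_pool *pool, uint32_t handle)
{
	assert(pool->cached <= TOY_POOL_CAP);
	if (pool->cached == TOY_POOL_CAP) {
		toy_destroy(pool, handle);
		return;
	}
	pool->handles[pool->cached++] = handle;
}

/* One aging pass: destroy cached handles above the pinned minimum. */
static void toy_pool_age(struct toy_frmr_pool *pool)
{
	while (pool->cached > pool->pinned)
		toy_destroy(pool, pool->handles[--pool->cached]);
}
```

In this model a pop after an aging pass still finds the pinned handles
cached, which is the property the pinning feature aims to give
latency-sensitive paths.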
API
===
The infrastructure exposes the following APIs:
- ib_frmr_pools_init(): Initialize FRMR pools for a device
- ib_frmr_pools_cleanup(): Clean up all pools for a device
- ib_frmr_pool_pop(): Get an FRMR handle from the pool
- ib_frmr_pool_push(): Return an FRMR handle to the pool
- ib_frmr_pools_set_aging_period(): Configure aging period
- ib_frmr_pools_set_pinned(): Set minimum pinned handles per pool
mlx5_ib
=======
The partial control and visibility that debugfs gave us, covering only the
'persistent' cache entries, is replaced by the netlink FRMR API, which
allows showing and setting properties of all available pools.
This series also changes the default behavior the MR cache had for PFs
(Physical Functions) by dropping the pre-allocation of MKEYs, which was
costing 100MB of memory per PF and slowing down the loading and
unloading of the driver.
Signed-off-by: Michael Guralnik <michaelgur@...dia.com>
Signed-off-by: Edward Srouji <edwards@...dia.com>
---
Changes in v2:
- Fix stack size warning in netlink set_pinned flow.
- Add commit to move async command context init and cleanup out of MR
cache logic.
- Add enforcement of access flags in set_pinned flow and enforce used
bits in vendor specific fields to ensure old kernels fail if any
unknown parameter is passed.
- Add an option to expose kernel-internal pools through netlink.
- Link to v1: https://lore.kernel.org/r/20251116-frmr_pools-v1-0-5eb3c8f5c9c4@nvidia.com
---
Chiara Meiohas (1):
RDMA/mlx5: Move device async_ctx initialization
Michael Guralnik (10):
IB/core: Introduce FRMR pools
RDMA/core: Add aging to FRMR pools
RDMA/core: Add FRMR pools statistics
RDMA/core: Add pinned handles to FRMR pools
RDMA/mlx5: Switch from MR cache to FRMR pools
net/mlx5: Drop MR cache related code
RDMA/nldev: Add command to get FRMR pools
RDMA/core: Add netlink command to modify FRMR aging
RDMA/nldev: Add command to set pinned FRMR handles
RDMA/nldev: Expose kernel-internal FRMR pools in netlink
drivers/infiniband/core/Makefile | 2 +-
drivers/infiniband/core/frmr_pools.c | 557 ++++++++++++
drivers/infiniband/core/frmr_pools.h | 63 ++
drivers/infiniband/core/nldev.c | 286 ++++++
drivers/infiniband/hw/mlx5/main.c | 10 +-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 86 +-
drivers/infiniband/hw/mlx5/mr.c | 1145 ++++--------------------
drivers/infiniband/hw/mlx5/odp.c | 19 -
drivers/infiniband/hw/mlx5/umr.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/main.c | 67 +-
include/linux/mlx5/driver.h | 11 -
include/rdma/frmr_pools.h | 39 +
include/rdma/ib_verbs.h | 8 +
include/uapi/rdma/rdma_netlink.h | 22 +
14 files changed, 1171 insertions(+), 1145 deletions(-)
---
base-commit: d056bc45b62b5981ebcd18c4303a915490b8ebe9
change-id: 20251116-frmr_pools-f823cc5e8a58
Best regards,
--
Edward Srouji <edwards@...dia.com>