Message-ID: <20251218-vf-bw-lag-mode-v1-0-7d8ed4368bea@nvidia.com>
Date: Thu, 18 Dec 2025 17:57:50 +0200
From: Edward Srouji <edwards@...dia.com>
To: <edwards@...dia.com>, Leon Romanovsky <leon@...nel.org>, Saeed Mahameed
<saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>, Mark Bloch
<mbloch@...dia.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Jason Gunthorpe
<jgg@...pe.ca>
CC: <netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Or Har-Toov <ohartoov@...dia.com>, "Maher
Sanalla" <msanalla@...dia.com>
Subject: [PATCH rdma-next 00/10] Support effective VF bandwidth query in LAG mode

Currently, the mlx5 driver exposes only the parent function's speed to VFs,
providing no way to query the actual effective bandwidth in LAG and
MPESW configurations. This limitation prevents userspace and
upper-layer software from obtaining accurate bandwidth information,
which impacts traffic scheduling decisions.

This series addresses the limitation by:

1. Adding mlx5 internal logic to calculate and propagate the effective
   aggregated LAG speed to all attached vports. The vport speeds are
   dynamically updated when LAG member link states change.

2. Extending RDMA core with a new ib_query_port_speed() verb and an
   IB_EVENT_DEVICE_SPEED_CHANGE async event. These interfaces expose
   the effective port speed to userspace, supporting speeds that are
   not expressible as IB speed * width (where width is 2^n).
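
As a rough illustration of why some aggregated speeds cannot be expressed
as IB speed * width, the standalone sketch below searches for a
(lane rate, width) pair matching a given rate. It is not code from this
series: rate_to_speed_width() is a hypothetical helper, rates are assumed
to be in units of 100 Mb/s, and the lane-rate table uses the commonly
quoted per-lane values (SDR 2.5 through XDR 200 Gb/s) with widths of
1, 2, 4 and 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Assumed per-lane rates in units of 100 Mb/s:
 * SDR, DDR, QDR, FDR, EDR, HDR, NDR, XDR (rounded, illustrative only).
 */
static const uint32_t lane_rates[] = { 25, 50, 100, 140, 250, 500, 1000, 2000 };

/* Power-of-two lane counts, following "width is 2^n" in the text above. */
static const uint32_t lane_counts[] = { 1, 2, 4, 8 };

/*
 * Hypothetical helper: return true and fill *lane_rate and *width if
 * rate (in 100 Mb/s units) equals some lane rate times some width.
 */
bool rate_to_speed_width(uint32_t rate, uint32_t *lane_rate, uint32_t *width)
{
	assert(lane_rate != NULL && width != NULL);

	for (size_t i = 0; i < ARRAY_LEN(lane_rates); i++) {
		for (size_t j = 0; j < ARRAY_LEN(lane_counts); j++) {
			if (lane_rates[i] * lane_counts[j] == rate) {
				*lane_rate = lane_rates[i];
				*width = lane_counts[j];
				return true;
			}
		}
	}
	return false;
}
```

Under these assumptions, 200 Gb/s (2000) has a matching pair, while
300 Gb/s (3000), for example three active 100 Gb/s members, has none,
which is the kind of value a direct speed query can still report.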

This series enables userspace applications to query the effective
port speed and receive notifications of speed changes in real time.
In LAG configurations, each mlx5 port reports the aggregated bandwidth
of all active LAG members.
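
The aggregation described above can be pictured with the minimal sketch
below. It assumes that "aggregated" means the sum of the speeds of the
members whose link is up; the actual driver logic may differ per LAG
mode. struct lag_member and lag_effective_speed() are hypothetical names
for illustration, not mlx5 structures or functions, and the 100 Mb/s
unit is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one LAG member port (not an mlx5 structure). */
struct lag_member {
	uint32_t speed;   /* link speed, assumed to be in units of 100 Mb/s */
	bool link_up;     /* true while the member's link is active */
};

/*
 * Sum the speeds of all members whose link is up. Members that are
 * down contribute nothing, so the result changes with link state.
 */
uint32_t lag_effective_speed(const struct lag_member *members, size_t n)
{
	uint32_t total = 0;

	assert(members != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (members[i].link_up)
			total += members[i].speed;
	}
	return total;
}
```

With two 100 Gb/s members up, this model reports 200 Gb/s; if one link
goes down, the value drops to 100 Gb/s, which is the point at which a
speed change notification would be useful to consumers.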

The series spans rdma-next and mlx5-next.
Patches 1-4 target mlx5-next (net/mlx5).
Patches 5-10 target rdma-next (IB/core, RDMA/mlx5).
The patches are tightly coupled and cannot reasonably be split, so
they are sent as a single series to allow coherent review.

Signed-off-by: Edward Srouji <edwards@...dia.com>
---
Or Har-Toov (10):
      net/mlx5: Add max_tx_speed and its CAP bit to IFC
      net/mlx5: Propagate LAG effective max_tx_speed to vports
      net/mlx5: Handle port and vport speed change events in MPESW
      net/mlx5: Add support for querying bond speed
      IB/core: Add async event on device speed change
      IB/core: Add helper to convert port attributes to data rate
      IB/core: Refactor rate_show to use ib_port_attr_to_rate()
      IB/core: Add query_port_speed verb
      RDMA/mlx5: Raise async event on device speed change
      RDMA/mlx5: Implement query_port_speed callback

 drivers/infiniband/core/device.c                  |   1 +
 drivers/infiniband/core/sysfs.c                   |  56 +-----
 drivers/infiniband/core/uverbs_std_types_device.c |  42 ++++
 drivers/infiniband/core/verbs.c                   |  52 +++++
 drivers/infiniband/hw/mlx5/main.c                 | 132 +++++++++++++
 drivers/infiniband/hw/mlx5/mlx5_ib.h              |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 215 +++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h |  11 ++
 .../net/ethernet/mellanox/mlx5/core/lag/mpesw.c   |  39 ++++
 .../net/ethernet/mellanox/mlx5/core/lag/mpesw.h   |  14 ++
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h   |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/port.c    |  24 +++
 drivers/net/ethernet/mellanox/mlx5/core/vport.c   |  74 +++++++
 include/linux/mlx5/driver.h                       |   1 +
 include/linux/mlx5/mlx5_ifc.h                     |   9 +-
 include/linux/mlx5/vport.h                        |   6 +
 include/rdma/ib_verbs.h                           |  17 ++
 include/uapi/rdma/ib_user_ioctl_cmds.h            |   6 +
 18 files changed, 651 insertions(+), 51 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20251218-vf-bw-lag-mode-ae720fe54e51
Best regards,
--
Edward Srouji <edwards@...dia.com>