Message-ID: <20250704122753.845841-1-cjubran@nvidia.com>
Date: Fri, 4 Jul 2025 15:27:51 +0300
From: Carolina Jubran <cjubran@...dia.com>
To: <stephen@...workplumber.org>, <dsahern@...il.com>
CC: Jiri Pirko <jiri@...dia.com>, <netdev@...r.kernel.org>, Carolina Jubran
<cjubran@...dia.com>
Subject: [PATCH iproute2-next 0/2] Add support for traffic class bandwidth configuration via devlink-rate
This series adds support for configuring bandwidth allocation per
traffic class (TC) through the devlink-rate interface. It introduces a
new 'tc-bw' attribute, allowing users to define how bandwidth is
distributed across up to 8 traffic classes in a single command. This
enables fine-grained traffic shaping and supports use cases such as
Enhanced Transmission Selection (ETS) as defined by IEEE 802.1Qaz.
Example commands:
- devlink port function rate add pci/0000:08:00.0/group \
    tx_share 10Gbit tx_max 50Gbit tc-bw 0:20 1:0 2:0 3:0 4:0 5:80 6:0 7:0

  Sets tc-bw on a rate node named 'group'; traffic classes 0 and
  5 will get relative shares of 20 and 80 respectively.

- devlink port function rate set pci/0000:08:00.0/1 \
    tc-bw 0:20 1:0 2:0 3:0 4:0 5:80 6:0 7:0

  Updates traffic class bandwidth shares on port 1.

- devlink port function rate set pci/0000:08:00.0/1 \
    tc-bw 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0

  Disables tc-bw on port 1.
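
The 'tc-bw' argument format used in the examples above can be sketched
in a few lines of Python. This is only an illustration of the
TC:SHARE syntax, not the parser added to devlink.c by this series. The
function names are hypothetical, and the rules that each index appears
at most once and that shares are non-negative integers are assumptions
drawn from the examples, not from the patch.

```python
NUM_TCS = 8  # the cover letter describes up to 8 traffic classes


def parse_tc_bw(arg):
    """Parse a string like '0:20 1:0 ... 7:0' into a list of 8 shares.

    Hypothetical helper: the validation rules here are assumptions.
    """
    shares = [0] * NUM_TCS
    seen = set()
    for token in arg.split():
        index_text, sep, share_text = token.partition(":")
        if not sep:
            raise ValueError(f"expected TC:SHARE, got {token!r}")
        index, share = int(index_text), int(share_text)
        if not 0 <= index < NUM_TCS:
            raise ValueError(f"traffic class {index} out of range")
        if index in seen:
            raise ValueError(f"traffic class {index} given twice")
        if share < 0:
            raise ValueError(f"negative share for traffic class {index}")
        seen.add(index)
        shares[index] = share
    return shares


def tc_bw_disabled(shares):
    """An all-zero list corresponds to the 'Disables tc-bw' example."""
    return all(share == 0 for share in shares)
```

With the values from the second example, parse_tc_bw("0:20 1:0 2:0 3:0
4:0 5:80 6:0 7:0") yields a list with 20 at index 0 and 80 at index 5,
and the all-zero list from the third example makes tc_bw_disabled()
return True.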
**Classification model and queue behavior**
In setups using traffic classes, classification can be performed
based on VLAN PCP or DSCP bits. These are mapped to traffic class
indices by the hypervisor or device configuration.
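
As a rough illustration of that mapping step, the sketch below extracts
the PCP from a VLAN TCI (top 3 bits of the 16-bit field) and the DSCP
from an IPv4 TOS byte (top 6 bits), then looks up a traffic class. The
two tables are hypothetical placeholders; the real mapping is whatever
the hypervisor or device configuration defines.

```python
# Hypothetical mapping tables, chosen only for illustration.
PCP_TO_TC = {pcp: pcp for pcp in range(8)}            # identity mapping
DSCP_TO_TC = {dscp: dscp >> 3 for dscp in range(64)}  # top 3 DSCP bits


def pcp_from_tci(tci):
    """PCP is the top 3 bits of the 16-bit VLAN Tag Control Information."""
    return (tci >> 13) & 0x7


def dscp_from_tos(tos):
    """DSCP is the top 6 bits of the IPv4 TOS / Traffic Class byte."""
    return (tos >> 2) & 0x3F


def classify(tci=None, tos=None):
    """Return a traffic class index, preferring PCP when a VLAN tag exists."""
    if tci is not None:
        return PCP_TO_TC[pcp_from_tci(tci)]
    if tos is not None:
        return DSCP_TO_TC[dscp_from_tos(tos)]
    return 0  # assumed default class for unclassified traffic
```

Preferring PCP over DSCP and defaulting to class 0 are choices made for
this sketch only.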
Each transmit queue is expected to carry traffic for a single traffic
class. Mixing different classes in the same queue can lead to
head-of-line blocking and scheduler misbehavior. The hypervisor ensures
that traffic flows are mapped to the correct queue, and the hardware
uses that queue's identity to assign the packet to the appropriate
traffic class scheduler.
The 'tc-bw' configuration assumes that this model is respected: each
traffic class should correspond to one or more queues that carry
traffic only for that class. Bandwidth shares are enforced per class,
not per queue.
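
To make the "per class, not per queue" point concrete, here is a small
sketch under simplifying assumptions: every class with a non-zero share
is continuously backlogged, and the shares divide a single total rate
(50 Gbit below, matching tx_max in the first example). The helper names
and the queue layout are hypothetical, and actual scheduler behavior
depends on the device.

```python
from collections import defaultdict


def queues_by_tc(queue_to_tc):
    """Group queue IDs by the single traffic class each one carries."""
    groups = defaultdict(list)
    for queue, tc in sorted(queue_to_tc.items()):
        groups[tc].append(queue)
    return dict(groups)


def class_rates(total_gbit, shares):
    """Split a total rate across classes in proportion to their shares.

    Assumes all classes with a non-zero share are busy at once.
    """
    total = sum(shares)
    if total == 0:
        return {}  # all-zero shares: tc-bw is disabled
    return {tc: total_gbit * share / total
            for tc, share in enumerate(shares) if share}


# Hypothetical layout: queues 0 and 1 carry TC 0, queue 2 carries TC 5.
groups = queues_by_tc({0: 0, 1: 0, 2: 5})
rates = class_rates(50, [20, 0, 0, 0, 0, 80, 0, 0])
```

In this layout queues 0 and 1 together draw from the TC 0 allocation
rather than each receiving it separately, while queue 2 alone draws
from the TC 5 allocation.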
Thanks
Carolina Jubran (2):
devlink: Update uapi headers
devlink: Add support for 'tc-bw' attribute in devlink-rate
devlink/devlink.c | 191 +++++++++++++++++++++++++++++++++--
include/uapi/linux/devlink.h | 9 ++
man/man8/devlink-rate.8 | 14 +++
3 files changed, 208 insertions(+), 6 deletions(-)
--
2.38.1