Message-ID: <20241113203317.2507537-2-cratiu@nvidia.com>
Date: Wed, 13 Nov 2024 22:30:38 +0200
From: Cosmin Ratiu <cratiu@...dia.com>
To: <netdev@...r.kernel.org>
CC: <jiri@...nulli.us>, <tariqt@...dia.com>, <kuba@...nel.org>,
<saeedm@...dia.com>, <cratiu@...dia.com>
Subject: [PATCH 01/10] [Cover Letter] devlink: Introduce rate domains
devlink objects support rate management for tx scheduling, which
involves maintaining a tree of rate nodes that corresponds to tx
schedulers in hardware. 'man devlink-rate' has the full details.
The tree of rate nodes is maintained per devlink object, protected by
the devlink lock.
There exists hardware capable of instantiating a tx scheduling tree
which spans multiple functions of the same physical device (and thus
multiple devlink objects), so the current API and locking scheme are
insufficient.
This patch series changes the devlink rate implementation and API to
allow supporting such hardware and managing tx scheduling trees across
multiple functions of a physical device.
Modeling this requires devlink rate nodes whose parents live in other
devlink objects. A naive approach that relies on the current
one-lock-per-devlink model is impossible, as it would in some cases
require acquiring multiple devlink locks in the correct order.
The proposed solution is to move rates into a separate object named
'rate domain'. Devlink objects create a private rate domain on init, and
hardware that supports cross-function tx scheduling can switch to using
a shared rate domain for a set of devlink objects. Shared rate domains
have an additional lock serializing access to rate nodes.
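The private/shared split can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: every name below
(struct rate_domain, struct dev, dev_use_shared_domain, ...) is
hypothetical and not taken from the patches, a pthread mutex stands in
for the kernel lock, and a counter stands in for the rate node list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the patches. */
struct rate_domain {
	pthread_mutex_t lock;	/* only taken when the domain is shared */
	bool shared;
	int num_nodes;		/* stand-in for the list of rate nodes */
};

struct dev {
	struct rate_domain private_domain;	/* created on init */
	struct rate_domain *domain;		/* domain currently in use */
};

void rate_domain_init(struct rate_domain *d, bool shared)
{
	assert(d != NULL);
	pthread_mutex_init(&d->lock, NULL);
	d->shared = shared;
	d->num_nodes = 0;
}

/* Every device starts out with its own private rate domain. */
void dev_init(struct dev *dev)
{
	rate_domain_init(&dev->private_domain, false);
	dev->domain = &dev->private_domain;
}

/* Capable hardware switches a set of devices to one shared domain. */
void dev_use_shared_domain(struct dev *dev, struct rate_domain *shared)
{
	assert(shared->shared);
	dev->domain = shared;
}

/* For a private domain this is a no-op: the per-device lock is assumed
 * to be held by the caller already, as in the current scheme. */
void rate_domain_lock(struct rate_domain *d)
{
	if (d->shared)
		pthread_mutex_lock(&d->lock);
}

void rate_domain_unlock(struct rate_domain *d)
{
	if (d->shared)
		pthread_mutex_unlock(&d->lock);
}

/* Add a rate node on behalf of dev; returns the domain's node count. */
int dev_rate_node_add(struct dev *dev)
{
	int n;

	rate_domain_lock(dev->domain);
	n = ++dev->domain->num_nodes;
	rate_domain_unlock(dev->domain);
	return n;
}
```

In this model, two devices that joined the same shared domain see a
single node count and serialize on a single lock, so no path ever needs
to hold two per-device locks at once.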
A new pair of devlink attributes is introduced for specifying a foreign
parent device, and the rate management devlink calls are changed to
allow setting a rate node's parent to a node in the requested foreign
parent device.
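A rough userspace sketch of how a foreign parent could be resolved and
validated follows. It rests on two assumptions that this letter does not
spell out: that the attribute pair identifies the parent device by bus
name and device name (as devlink handles usually do), and that a
cross-device parent is accepted only when both devices use the same rate
domain. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model; names are illustrative, not the series' API. */
struct rate_domain {
	int id;
};

struct dev {
	const char *bus_name;	/* e.g. "pci" */
	const char *dev_name;	/* e.g. "0000:08:00.0" */
	struct rate_domain *domain;
};

struct rate_node {
	const char *name;
	struct dev *owner;
	struct rate_node *parent;
};

/* Find a device by its bus/device name pair; NULL if not found. */
struct dev *dev_lookup(struct dev *devs, size_t n,
		       const char *bus, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strcmp(devs[i].bus_name, bus) &&
		    !strcmp(devs[i].dev_name, name))
			return &devs[i];
	}
	return NULL;
}

/* Set node's parent. A parent owned by another device is accepted only
 * when both devices use the same rate domain (assumed rule).
 * Returns 0 on success, -1 otherwise. */
int rate_node_set_parent(struct rate_node *node, struct rate_node *parent)
{
	assert(node != NULL);
	if (parent && parent->owner != node->owner &&
	    parent->owner->domain != node->owner->domain)
		return -1;
	node->parent = parent;
	return 0;
}
```

Rejecting parents from a different domain keeps every tree inside one
domain, so whichever lock protects that domain is enough to protect the
whole tree.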
Finally, this API is used from mlx5 for NICs with the correct capability
bit to allow cross-function tx scheduling.
Patches:
Small cleanup:
  devlink: Remove unused param of devlink_rate_nodes_check
Introduce private rate domains:
  devlink: Store devlink rates in a rate domain
Introduce rate domain locking (a no-op for now, as rate domains are private):
  devlink: Serialize access to rate domains
Introduce shared rate domains and a global registry for them:
  devlink: Introduce shared rate domains
Extend the devlink rate API with foreign parent devices:
  devlink: Allow specifying parent device for rate commands
  devlink: Allow rate node parents from other devlinks
Extend the mlx5 implementation with the ability to share qos domains:
  net/mlx5: qos: Introduce shared esw qos domains
Use the newly introduced infrastructure to support cross-function tx scheduling:
  net/mlx5: qos: Support cross-esw scheduling in qos.c
  net/mlx5: qos: Init shared devlink rate domain
Issue: 3645895
Change-Id: If03c5c0562bf4b53fe2fa7b8a070207f6715a755
Signed-off-by: Cosmin Ratiu <cratiu@...dia.com>
--
2.43.2