Message-Id: <20240219100555.7220-6-mateusz.polchlopek@intel.com>
Date: Mon, 19 Feb 2024 05:05:59 -0500
From: Mateusz Polchlopek <mateusz.polchlopek@...el.com>
To: intel-wired-lan@...ts.osuosl.org
Cc: netdev@...r.kernel.org,
horms@...nel.org,
przemyslaw.kitszel@...el.com,
Michal Wilczynski <michal.wilczynski@...el.com>,
Mateusz Polchlopek <mateusz.polchlopek@...el.com>
Subject: [Intel-wired-lan] [PATCH iwl-next v4 5/5] ice: Document tx_scheduling_layers parameter
From: Michal Wilczynski <michal.wilczynski@...el.com>
A new driver-specific parameter, 'tx_scheduling_layers', was introduced.
Describe the parameter in the documentation.
Signed-off-by: Michal Wilczynski <michal.wilczynski@...el.com>
Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@...el.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@...el.com>
---
Documentation/networking/devlink/ice.rst | 41 ++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index efc6be109dc3..1ae46dee0fd5 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -36,6 +36,47 @@ Parameters
The latter allows for bandwidth higher than external port speed
when looping back traffic between VFs. Works with 8x10G and 4x25G
cards.
+ * - ``tx_scheduling_layers``
+ - permanent
+ - The ice hardware uses hierarchical scheduling for Tx with a fixed
+ number of layers in the scheduling tree. The root node represents a
+ port, while all the leaves represent the queues. This way of
+ configuring the Tx scheduler allows features like DCB or devlink-rate
+ (documented below) to fine-tune how much bandwidth is given to any
+ given queue or group of queues, as scheduling parameters can be
+ configured at any layer of the tree. The default 9-layer tree topology
+ was deemed best for most workloads, as it gives an optimal ratio of
+ performance to configurability. However, for some specific use cases
+ it is not. One example is sending traffic to a number of queues that
+ is not a multiple of 8. Since the 9-layer topology limits the maximum
+ number of children per node to 8, the 9th queue has a different parent
+ than the rest and is given more bandwidth credits. This causes a
+ problem when the system is sending traffic to 9 queues:
+
+ | tx_queue_0_packets: 24163396
+ | tx_queue_1_packets: 24164623
+ | tx_queue_2_packets: 24163188
+ | tx_queue_3_packets: 24163701
+ | tx_queue_4_packets: 24163683
+ | tx_queue_5_packets: 24164668
+ | tx_queue_6_packets: 23327200
+ | tx_queue_7_packets: 24163853
+ | tx_queue_8_packets: 91101417 < Too much traffic is sent to the 9th queue
+
+ In some cases this is a significant concern, so the idea is to empower
+ the user to switch to a 5-layer topology, enabling performance gains
+ at the cost of configurability for features like DCB and devlink-rate.
+
+ This parameter gives the user the flexibility to choose the 5-layer
+ transmit scheduler topology. After changing the parameter, a reboot is
+ required for the change to take effect.
+
+ The user can set the parameter to 9 (the default) or 5, e.g.:
+ $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers
+ value 5 cmode permanent
+
+ And verify that the value has been set:
+ $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
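+
+ Illustrative output of the show command (a sketch only, not captured
+ from real hardware; exact formatting may differ between devlink
+ versions):
+
+ | pci/0000:16:00.0:
+ |   name tx_scheduling_layers type driver-specific
+ |     values:
+ |       cmode permanent value 5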
Info versions
=============
--
2.38.1