[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1521624629-46666-1-git-send-email-talgi@mellanox.com>
Date: Wed, 21 Mar 2018 11:30:29 +0200
From: Tal Gilboa <talgi@...lanox.com>
To: "David S. Miller" <davem@...emloft.net>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Tariq Toukan <tariqt@...lanox.com>,
Tal Gilboa <talgi@...lanox.com>
Subject: [PATCH net-next] Documentation/networking: Add net DIM documentation
Net DIM is a generic algorithm, purposed for dynamically
optimizing network devices interrupt moderation. This
document describes how it works and how to use it.
Signed-off-by: Tal Gilboa <talgi@...lanox.com>
---
Documentation/networking/net_dim.txt | 174 +++++++++++++++++++++++++++++++++++
1 file changed, 174 insertions(+)
create mode 100644 Documentation/networking/net_dim.txt
diff --git a/Documentation/networking/net_dim.txt b/Documentation/networking/net_dim.txt
new file mode 100644
index 0000000..ef622c8
--- /dev/null
+++ b/Documentation/networking/net_dim.txt
@@ -0,0 +1,174 @@
+Net DIM - Generic Network Dynamic Interrupt Moderation
+======================================================
+
+Author:
+ Tal Gilboa <talgi@...lanox.com>
+
+
+Contents
+=========
+
+- Assumptions
+- Introduction
+- The Net DIM Algorithm
+- Registering a Network Device to DIM
+- Example
+
+Part 0: Assumptions
+======================
+
+This document assumes the reader has basic knowledge in network drivers
+and in general interrupt moderation.
+
+
+Part I: Introduction
+======================
+
+Dynamic Interrupt Moderation (DIM) (in networking) refers to changing the interrupt
+moderation configuration of a channel in order to optimize packet processing.
+The mechanism includes an algorithm which decides if and how to change
+moderation parameters for a channel, usually by performing an analysis on
+runtime data sampled from the system. Net DIM is such a mechanism. In each
+iteration of the algorithm, it analyses a given sample of the data, compares it to
+the previous sample and if required, is can decide to change some of the interrupt moderation
+configuration fields. The data sample is composed of data bandwidth, the number of
+packets and the number of events. The time between samples is also measured. Net DIM
+compares the current and the previous data and returns an adjusted interrupt
+moderation configuration object. In some cases, the algorithm might decide not
+to change anything. The configuration fields are the minimum duration
+(microseconds) allowed between events and the maximum number of wanted packets
+per event. The Net DIM algorithm ascribes importance to increase bandwidth over
+reducing interrupt rate.
+
+
+Part II: The Net DIM Algorithm
+===============================
+
+Each iteration of the Net DIM algorithm follows these steps:
+1. Calculates new data sample.
+2. Compares it to previous sample.
+3. Makes a decision - suggests interrupt moderation configuration fields.
+4. Applies a schedule work function, which applies suggested configuration.
+
+The first two steps are straight forward, both the new and the previous data are
+supplied by the driver registered to Net DIM. The previous data is the new data
+supplied to the previous iteration. The comparison step checks the difference
+between the new and previous data and decides on the result of the last step. A step
+would result as "better" if bandwidth increases and as "worse" if bandwidth
+reduces. If there is no change in bandwidth, the packet rate is compared in a similar
+fashion - increase == "better" and decrease == "worse". In case there is no
+change in the packet rate as well, the interrupt rate is compared. Here the
+algorithm tries to optimize for lower interrupt rate so an increase in the
+interrupt rate is considered "worse" and a decrease is considered "better".
+Step #2 has an optimization for avoiding false results, it only considers a
+difference between samples as valid if it is greater than a certain percentage.
+Also, since Net DIM does not measure anything by itself, it assumes the data
+provided by the driver is valid.
+
+Step #3 decides on the suggested configuration based on the result from step #2
+and the internal state of the algorithm. The states reflect the "direction" of
+the algorithm, is it going left (reducing moderation), right (increasing
+moderation) or standing still. Another optimization is that if a decision
+to stay still is made multiple times, the interval between iterations of the
+algorithm would increase in order to reduce calculation overhead. Also, after
+"parking" on one of the most left or most right decisions, the algorithm may
+decide to verify this decision by taking a step on the other direction. This is
+done in order to avoid getting stuck in a "deep sleep" scenario. Once a
+decision is made, an interrupt moderation configuration is selected from
+the predefined profiles.
+
+The last step is to notify the registered driver that it should apply the suggested
+configuration. This is done by scheduling a work function, defined by the Net DIM
+API and provided by the registered driver.
+
+As you can see, Net DIM itself does not actively interact with the system. It
+would have trouble making the correct decisions if the wrong data is supplied to it
+and it would be useless if the work function would not apply the suggested
+configuration. This does, however, allows the registered driver some room for manoeuvre
+as it may provide partial data or ignore the algorithm suggestion under some
+conditions.
+
+
+Part III: Registering a Network Device to DIM
+==============================================
+
+Net DIM API exposes the main function net_dim(struct net_dim *dim,
+struct net_dim_sample end_sample). This function is the entry point to the Net
+DIM algorithm and has to be called every time the driver would like to check if
+it should change interrupt moderation parameters. The driver should provide two
+data structures: struct net_dim and struct net_dim_sample. Struct net_dim
+describes the state of DIM for a specific object (RX queue, TX queue,
+other queues, etc.). This includes the current selected profile, previous data
+samples, the callback function provided by the driver and more.
+Struct net_dim_sample describes a data sample, which will be compared to the
+data sample stored in struct net_dim in order to decide on the algorithm's next
+step. The sample should include bytes, packets and interrupts, measured by
+the driver.
+
+In order to use Net DIM from a networking driver, the driver needs to call the
+main net_dim() function. The recommended method is to call net_dim() on each
+interrupt. Since Net DIM has a built-in moderation and it might decide to skip
+iterations under certain conditions, there is no need to moderate the net_dim()
+calls as well. As mentioned above, the driver needs to provide an object of type
+struct net_dim to the net_dim() function call. It is advised for each entity
+using Net DIM to hold a struct net_dim as part of its data structure and use it
+as the main Net DIM API object. The struct net_dim_sample should hold the latest
+bytes, packets and interrupts count. No need to perform any calculations, just
+include the raw data.
+
+The net_dim() call itself does not return anything. Instead Net DIM relies on the
+driver to provide a callback function, which is called when the algorithm
+decides to make a change in the interrupt moderation parameters. This callback
+will be scheduled and ran in a separate thread in order not to add overhead to
+the data flow. After the work is done, Net DIM algorithm needs to be set to
+the proper state in order to move to the next iteration.
+
+
+Part IV: Example
+=================
+
+The following code demonstrates how to register a driver to Net DIM. The actual
+usage is not complete but it should make the outline if the usage clear.
+
+my_driver.c:
+
+#include <linux/net_dim.h>
+
+/* Callback for net DIM to schedule on a decision to change moderation */
+void my_driver_do_dim_work(struct work_struct *work)
+{
+ /* Get struct net_dim from struct work_struct */
+ struct net_dim *dim = container_of(work, struct net_dim,
+ work);
+ /* Do interrupt moderation related stuff */
+ ...
+
+ /* Signal net DIM work is done and it should move to next iteration */
+ dim->state = NET_DIM_START_MEASURE;
+}
+
+/* My driver's interrupt handler */
+int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
+{
+ ...
+ /* A struct to hold current measured data */
+ struct net_dim_sample dim_sample;
+ ...
+ /* Initiate data sample struct with current data */
+ net_dim_sample(my_entity->events,
+ my_entity->packets,
+ my_entity->bytes,
+ &dim_sample);
+ /* Call net DIM */
+ net_dim(&my_entity->dim, dim_sample);
+ ...
+}
+
+/* My entity's initialization function (my_entity was already allocated) */
+int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
+{
+ ...
+ /* Initiate struct work_struct with my driver's callback function */
+ INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
+ ...
+}
--
1.8.3.1
Powered by blists - more mailing lists