[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <025aeb3d5b1c4f25b39d9a041556e1d703615e8f.1714134205.git.petrm@nvidia.com>
Date: Fri, 26 Apr 2024 14:42:24 +0200
From: Petr Machata <petrm@...dia.com>
To: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, <netdev@...r.kernel.org>
CC: Ido Schimmel <idosch@...dia.com>, Petr Machata <petrm@...dia.com>, "Amit
Cohen" <amcohen@...dia.com>, <mlxsw@...dia.com>
Subject: [PATCH net-next 3/5] mlxsw: pci: Initialize dummy net devices for NAPI
From: Amit Cohen <amcohen@...dia.com>
mlxsw will use NAPI for event processing in a next patch. As preparation,
add two dummy net devices and initialize them.
NAPI instance should be attached to net device. Usually each queue is used
by a single net device in network drivers, so the mapping between net
device to NAPI instance is intuitive. In our case, Rx queues are not per
port, they are per trap-group. Tx queues are mapped to net devices, but we
do not have a separate queue for each local port, several ports share the
same queue.
Use init_dummy_netdev() to initialize dummy net devices for NAPI.
To run NAPI poll method in a kernel thread, the net device which NAPI
instance is attached to should be marked as 'threaded'. It is
recommended to handle Tx packets in softIRQ context, as usually this is
a short task - just free the Tx packet which has been transmitted.
Rx packets handling is more complicated task, so drivers can use a
dedicated kernel thread to process them. It allows processing packets from
different Rx queues in parallel. We would like to handle only Rx packets in
kernel threads, which means that we will use two dummy net devices
(one for Rx and one for Tx). Set only one of them with 'threaded' as it
will be used for Rx processing. Do not fail in case that setting 'threaded'
fails, as it is better to use regular softIRQ NAPI rather than preventing
the driver from loading.
Note that the net devices are initialized with init_dummy_netdev(), so
they are not registered, which means that they will not be visible to user.
It will not be possible to change 'threaded' configuration from user
space, but it is reasonable in our case, as there is no another
configuration which makes sense, considering that user has no influence
on the usage of each queue.
Signed-off-by: Amit Cohen <amcohen@...dia.com>
Reviewed-by: Ido Schimmel <idosch@...dia.com>
Signed-off-by: Petr Machata <petrm@...dia.com>
---
drivers/net/ethernet/mellanox/mlxsw/pci.c | 41 +++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 2094b802d8d5..ec54b876dfd9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -127,8 +127,42 @@ struct mlxsw_pci {
u8 num_cqs; /* Number of CQs */
u8 num_sdqs; /* Number of SDQs */
bool skip_reset;
+ struct net_device *napi_dev_tx;
+ struct net_device *napi_dev_rx;
};
+static int mlxsw_pci_napi_devs_init(struct mlxsw_pci *mlxsw_pci)
+{
+ int err;
+
+ mlxsw_pci->napi_dev_tx = alloc_netdev_dummy(0);
+ if (!mlxsw_pci->napi_dev_tx)
+ return -ENOMEM;
+ strscpy(mlxsw_pci->napi_dev_tx->name, "mlxsw_tx",
+ sizeof(mlxsw_pci->napi_dev_tx->name));
+
+ mlxsw_pci->napi_dev_rx = alloc_netdev_dummy(0);
+ if (!mlxsw_pci->napi_dev_rx) {
+ err = -ENOMEM;
+ goto err_alloc_rx;
+ }
+ strscpy(mlxsw_pci->napi_dev_rx->name, "mlxsw_rx",
+ sizeof(mlxsw_pci->napi_dev_rx->name));
+ dev_set_threaded(mlxsw_pci->napi_dev_rx, true);
+
+ return 0;
+
+err_alloc_rx:
+ free_netdev(mlxsw_pci->napi_dev_tx);
+ return err;
+}
+
+static void mlxsw_pci_napi_devs_fini(struct mlxsw_pci *mlxsw_pci)
+{
+ free_netdev(mlxsw_pci->napi_dev_rx);
+ free_netdev(mlxsw_pci->napi_dev_tx);
+}
+
static void mlxsw_pci_queue_tasklet_schedule(struct mlxsw_pci_queue *q)
{
tasklet_schedule(&q->tasklet);
@@ -1721,6 +1755,10 @@ static int mlxsw_pci_init(void *bus_priv, struct mlxsw_core *mlxsw_core,
if (err)
goto err_requery_resources;
+ err = mlxsw_pci_napi_devs_init(mlxsw_pci);
+ if (err)
+ goto err_napi_devs_init;
+
err = mlxsw_pci_aqs_init(mlxsw_pci, mbox);
if (err)
goto err_aqs_init;
@@ -1738,6 +1776,8 @@ static int mlxsw_pci_init(void *bus_priv, struct mlxsw_core *mlxsw_core,
err_request_eq_irq:
mlxsw_pci_aqs_fini(mlxsw_pci);
err_aqs_init:
+ mlxsw_pci_napi_devs_fini(mlxsw_pci);
+err_napi_devs_init:
err_requery_resources:
err_config_profile:
err_cqe_v_check:
@@ -1765,6 +1805,7 @@ static void mlxsw_pci_fini(void *bus_priv)
free_irq(pci_irq_vector(mlxsw_pci->pdev, 0), mlxsw_pci);
mlxsw_pci_aqs_fini(mlxsw_pci);
+ mlxsw_pci_napi_devs_fini(mlxsw_pci);
mlxsw_pci_fw_area_fini(mlxsw_pci);
mlxsw_pci_free_irq_vectors(mlxsw_pci);
}
--
2.43.0
Powered by blists - more mailing lists