Message-Id: <20240522033256.11960-2-anand.a.khoje@oracle.com>
Date: Wed, 22 May 2024 09:02:56 +0530
From: Anand Khoje <anand.a.khoje@...cle.com>
To: linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: anand.a.khoje@...cle.com, rama.nichanamatlu@...cle.com,
manjunath.b.patil@...cle.com
Subject: [PATCH 1/1] RDMA/mlx5: Release CPU for other processes in mlx5_free_cmd_msg()
In non-FLR context, a CX-5 device at times requests the release of
~8 million device pages. This requires a very large number of cmd
mailboxes, which must be freed once the pages are reclaimed. Freeing
that many cmd mailboxes consumes many seconds of CPU time and, on
non-preemptible kernels, starves critical processes on that CPU's run
queue. To alleviate this, relinquish the CPU periodically, but only
conditionally, while freeing the mailboxes.
Orabug: 36275016
Signed-off-by: Anand Khoje <anand.a.khoje@...cle.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 9c21bce..9fbf25d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1336,16 +1336,23 @@ static struct mlx5_cmd_msg *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
 	return ERR_PTR(err);
 }
 
+#define RESCHED_MSEC 2
 static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
 			      struct mlx5_cmd_msg *msg)
 {
 	struct mlx5_cmd_mailbox *head = msg->next;
 	struct mlx5_cmd_mailbox *next;
+	unsigned long start_time = jiffies;
 
 	while (head) {
 		next = head->next;
 		free_cmd_box(dev, head);
 		head = next;
+		if (time_after(jiffies, start_time + msecs_to_jiffies(RESCHED_MSEC))) {
+			mlx5_core_warn_rl(dev, "Spent more than %d msecs, yielding CPU\n", RESCHED_MSEC);
+			cond_resched();
+			start_time = jiffies;
+		}
 	}
 	kfree(msg);
 }
--
1.8.3.1
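
For readers who want to try the pattern outside the kernel, the
time-budgeted yield used in mlx5_free_cmd_msg() above can be sketched in
user space. This is only an illustration under stated assumptions, not the
driver code: struct box, build_chain(), free_chain() and now_ns() are
hypothetical names, clock_gettime(CLOCK_MONOTONIC) stands in for jiffies,
and sched_yield() stands in for cond_resched(). Note that sched_yield()
always gives up the CPU, whereas cond_resched() does so only when a
reschedule is pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for struct mlx5_cmd_mailbox: a singly linked node. */
struct box {
	struct box *next;
};

/* Monotonic clock in nanoseconds; plays the role of jiffies here. */
static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Build a chain of n nodes. Returns NULL if n == 0 or allocation fails. */
static struct box *build_chain(size_t n)
{
	struct box *head = NULL;

	while (n--) {
		struct box *b = malloc(sizeof(*b));

		if (!b) {
			/* Undo the partial chain before reporting failure. */
			while (head) {
				struct box *next = head->next;

				free(head);
				head = next;
			}
			return NULL;
		}
		b->next = head;
		head = b;
	}
	return head;
}

/*
 * Free every node in the chain. Whenever more than budget_ns has elapsed
 * since the last yield, give up the CPU and restart the timer.
 * Returns the number of nodes freed; *yields (if non-NULL) is incremented
 * once per yield.
 */
static size_t free_chain(struct box *head, long long budget_ns, size_t *yields)
{
	long long start = now_ns();
	size_t freed = 0;

	assert(budget_ns >= 0);

	while (head) {
		struct box *next = head->next;

		free(head);
		head = next;
		freed++;

		if (now_ns() - start > budget_ns) {
			sched_yield();	/* unconditional, unlike cond_resched() */
			if (yields)
				(*yields)++;
			start = now_ns();
		}
	}
	return freed;
}
```

Calling free_chain(build_chain(1000), 2000000LL, &yields) mirrors the
2 msec budget chosen by RESCHED_MSEC. The number of yields depends on how
long the frees take on a given machine, so it is not fixed.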