Message-ID: <04591dbf-57de-4d21-8009-5f462fb59c73@oracle.com>
Date: Wed, 29 May 2024 17:31:02 +0530
From: Anand Khoje <anand.a.khoje@...cle.com>
To: Shay Drori <shayd@...dia.com>, linux-rdma@...r.kernel.org,
linux-kernel@...r.kernel.org, moshe@...dia.com
Cc: rama.nichanamatlu@...cle.com, manjunath.b.patil@...cle.com,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH 1/1] RDMA/mlx5: Release CPU for other processes in
mlx5_free_cmd_msg()

On 5/26/24 20:53, Shay Drori wrote:
> Hi Anand.
>
> First, the correct Mailing list for this patch is
> netdev@...r.kernel.org, please send there the next version.
>
> On 22/05/2024 6:32, Anand Khoje wrote:
>> In non-FLR context, the CX-5 at times requests release of ~8 million
>> device pages.
>> This needs a huge number of cmd mailboxes, which are released once
>> the pages are reclaimed. Releasing that many cmd mailboxes
>> consumes many seconds of CPU time and, on non-preemptible kernels,
>> starves critical processes on that CPU’s RQ. To alleviate
>> this, this patch relinquishes the CPU periodically but conditionally.
>>
>> Orabug: 36275016
>
> this doesn't seem relevant
>
>>
>> Signed-off-by: Anand Khoje <anand.a.khoje@...cle.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> index 9c21bce..9fbf25d 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> @@ -1336,16 +1336,23 @@ static struct mlx5_cmd_msg
>> *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
>> return ERR_PTR(err);
>> }
>> +#define RESCHED_MSEC 2
>
>
> What if you add cond_resched() on every iteration of the loop? Does it
> take much more time to finish 8 million pages, or the same?
> If it does matter, maybe 2 ms is too high a frequency? 20 ms? 200 ms?
>
Shay,

There is no hard rule we can apply here, only guidance and suggestions.
Relinquishing the CPU too often leads to thrashing and high
context-switch costs, while relinquishing it too infrequently leads to
RQ starvation.
Based on observations with our applications and workload, we chose
2 msecs as a middle ground.
Your suggestions are also very viable, so we are reconsidering the value.
This was very helpful, thank you! I will send a v2 after more testing.

Thanks,
Anand
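
To make the trade-off above concrete, here is a minimal userspace sketch,
not the driver code. It assumes a hypothetical `struct box` in place of
`struct mlx5_cmd_mailbox`, uses `sched_yield()` as a rough stand-in for
`cond_resched()` (which, unlike `sched_yield()`, only reschedules when the
scheduler has asked for it), and `clock_gettime()` in place of jiffies.
The helper names `build_list()` and `free_list_yielding()` are invented
for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for struct mlx5_cmd_mailbox. */
struct box {
	struct box *next;
};

/* Monotonic clock reading in nanoseconds. */
static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Build a list of n nodes; returns NULL for n == 0 or on allocation failure. */
static struct box *build_list(size_t n)
{
	struct box *head = NULL;

	for (size_t i = 0; i < n; i++) {
		struct box *b = malloc(sizeof(*b));

		if (!b) {
			/* Undo the partial list before reporting failure. */
			while (head) {
				struct box *next = head->next;

				free(head);
				head = next;
			}
			return NULL;
		}
		b->next = head;
		head = b;
	}
	return head;
}

/*
 * Free the whole list. With interval_ns == 0, yield after every node;
 * otherwise yield only once more than interval_ns has elapsed since the
 * last yield. Returns the number of yields performed.
 */
static size_t free_list_yielding(struct box *head, long long interval_ns)
{
	long long start = now_ns();
	size_t yields = 0;

	assert(interval_ns >= 0);
	while (head) {
		struct box *next = head->next;

		free(head);
		head = next;
		if (interval_ns == 0 || now_ns() - start > interval_ns) {
			sched_yield();	/* userspace stand-in for cond_resched() */
			yields++;
			start = now_ns();
		}
	}
	return yields;
}
```

Calling `free_list_yielding(head, 0)` models yielding on every iteration,
while an interval of 2000000 ns models the 2 msec gate from the patch.
Timing both over a long list gives a rough feel for the cost of each
cadence, though userspace numbers will not carry over directly to the
kernel.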
> Thanks
>
>> static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
>> struct mlx5_cmd_msg *msg)
>> {
>> struct mlx5_cmd_mailbox *head = msg->next;
>> struct mlx5_cmd_mailbox *next;
>> + unsigned long start_time = jiffies;
>> while (head) {
>> next = head->next;
>> free_cmd_box(dev, head);
>> head = next;
>> + if (time_after(jiffies, start_time +
>> msecs_to_jiffies(RESCHED_MSEC))) {
>> + mlx5_core_warn_rl(dev, "Spent more than %d msecs,
>> yielding CPU\n", RESCHED_MSEC);
>> + cond_resched();
>> + start_time = jiffies;
>> + }
>> }
>> kfree(msg);
>> }
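
For reference, the `time_after()` check in the loop above stays correct
when jiffies wraps around, because it compares the signed difference of
the two counters rather than their raw values. A minimal userspace sketch
of that comparison follows; the name `ticks_after()` is hypothetical, and
the code relies on the usual two's-complement conversion from
`unsigned long` to `long`, which is implementation-defined in ISO C.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace mirror of the kernel's time_after(a, b): true if tick count a
 * is later than b. The unsigned subtraction wraps, and the signed cast
 * turns "a is ahead of b" into a negative value, so the result holds
 * across counter wraparound as long as a and b are less than half the
 * counter range apart.
 */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```

A plain `a > b` comparison would give the wrong answer for a start time
taken just before the counter wraps, which is why the patch uses
`time_after()` instead.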