netdev - Re: [PATCH 1/1] RDMA/mlx5: Release CPU for other processes in mlx5_free_cmd

Message-ID: <04591dbf-57de-4d21-8009-5f462fb59c73@oracle.com>
Date: Wed, 29 May 2024 17:31:02 +0530
From: Anand Khoje <anand.a.khoje@...cle.com>
To: Shay Drori <shayd@...dia.com>, linux-rdma@...r.kernel.org,
        linux-kernel@...r.kernel.org, moshe@...dia.com
Cc: rama.nichanamatlu@...cle.com, manjunath.b.patil@...cle.com,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH 1/1] RDMA/mlx5: Release CPU for other processes in
 mlx5_free_cmd_msg()


On 5/26/24 20:53, Shay Drori wrote:
> Hi Anand.
>
> First, the correct Mailing list for this patch is
> netdev@...r.kernel.org, please send there the next version.
>
> On 22/05/2024 6:32, Anand Khoje wrote:
>> In non-FLR context, CX-5 at times requests release of ~8 million
>> device pages. This needs a huge number of cmd mailboxes, which are
>> released once the pages are reclaimed. Releasing that many cmd mailboxes
>> consumes many seconds of CPU time and, on non-preemptible kernels,
>> starves critical processes on that CPU's RQ. To alleviate this, the
>> patch relinquishes the CPU periodically but conditionally.
>>
>> Orabug: 36275016
>
> this doesn't seem relevant
>
>>
>> Signed-off-by: Anand Khoje <anand.a.khoje@...cle.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> index 9c21bce..9fbf25d 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
>> @@ -1336,16 +1336,23 @@ static struct mlx5_cmd_msg *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
>>       return ERR_PTR(err);
>>   }
>> +#define RESCHED_MSEC 2
>
>
> What if you add cond_resched() on every iteration of the loop ? Does it
> take much more time to finish 8 Million pages or same ?
> If it does matter, maybe 2 ms is too high freq ? 20 ms ? 200 ms ?
>
Shay,


There is no hard rule for choosing this interval, only guidance and
suggestions. If the delay is too short, the CPU is relinquished too often,
which leads to thrashing and high context-switch costs. If it is too long,
the CPU is relinquished too rarely, which leads to RQ starvation.
Based on observations with our applications and workloads, we chose
2 msecs as a middle ground.
Your suggestions are also viable, so we are reconsidering the value.
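
To make the trade-off easier to experiment with, here is a minimal userspace
sketch of the same idea. It is not the driver code: struct mailbox,
build_chain(), free_chain() and now_ms() are hypothetical names, and
sched_yield() only stands in for cond_resched(). A negative budget yields on
every iteration, which approximates the unconditional variant suggested above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a chain of command mailboxes. */
struct mailbox {
    struct mailbox *next;
};

/* Monotonic clock reading in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Build a chain of n nodes; returns NULL for n == 0 or on allocation failure. */
static struct mailbox *build_chain(size_t n)
{
    struct mailbox *head = NULL;

    for (size_t i = 0; i < n; i++) {
        struct mailbox *m = malloc(sizeof(*m));

        if (!m) {
            /* Undo the partial chain so nothing leaks. */
            while (head) {
                struct mailbox *next = head->next;

                free(head);
                head = next;
            }
            return NULL;
        }
        m->next = head;
        head = m;
    }
    return head;
}

/*
 * Free the whole chain, yielding the CPU whenever more than budget_ms
 * milliseconds have passed since the last yield (or since the start).
 * A negative budget_ms yields after every node.
 * Returns the number of yields, so different budgets can be compared.
 */
static size_t free_chain(struct mailbox *head, long long budget_ms)
{
    size_t yields = 0;
    long long start = now_ms();

    while (head) {
        struct mailbox *next = head->next;

        free(head);
        head = next;
        if (now_ms() - start > budget_ms) {
            sched_yield();      /* userspace analogue of cond_resched() */
            yields++;
            start = now_ms();   /* restart the time budget */
        }
    }
    return yields;
}
```

Calling free_chain(build_chain(n), 2), free_chain(build_chain(n), 20) and
free_chain(build_chain(n), -1) for a large n, and timing each call, gives a
rough feel for how yield frequency affects total run time. The numbers will
not carry over directly to the kernel.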

This was very helpful, thank you! I will resend a v2 after more testing.

Thanks,

Anand


> Thanks
>
>>   static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
>>                     struct mlx5_cmd_msg *msg)
>>   {
>>       struct mlx5_cmd_mailbox *head = msg->next;
>>       struct mlx5_cmd_mailbox *next;
>> +    unsigned long start_time = jiffies;
>>       while (head) {
>>           next = head->next;
>>           free_cmd_box(dev, head);
>>           head = next;
>> +        if (time_after(jiffies, start_time + msecs_to_jiffies(RESCHED_MSEC))) {
>> +            mlx5_core_warn_rl(dev, "Spent more than %d msecs, yielding CPU\n", RESCHED_MSEC);
>> +            cond_resched();
>> +            start_time = jiffies;
>> +        }
>>       }
>>       kfree(msg);
>>   }