linux-kernel - Re: [PATCH net-next 0/6] net/mlx5e: Speedup channel configuration operations

Message-ID: <878qgajcnf.fsf@toke.dk>
Date: Thu, 13 Nov 2025 14:16:36 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Tariq Toukan <ttoukan.linux@...il.com>, Tariq Toukan
 <tariqt@...dia.com>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Andrew Lunn
 <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>
Cc: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
 Mark Bloch <mbloch@...dia.com>, Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>, Jesper Dangaard Brouer
 <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>,
 netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
 linux-kernel@...r.kernel.org, bpf@...r.kernel.org, Gal Pressman
 <gal@...dia.com>, Leon Romanovsky <leonro@...dia.com>, Moshe Shemesh
 <moshe@...dia.com>, William Tu <witu@...dia.com>, Dragos Tatulea
 <dtatulea@...dia.com>, Nimrod Oren <noren@...dia.com>, Alex Lazar
 <alazar@...dia.com>
Subject: Re: [PATCH net-next 0/6] net/mlx5e: Speedup channel configuration
 operations

Tariq Toukan <ttoukan.linux@...il.com> writes:

> On 12/11/2025 18:33, Toke Høiland-Jørgensen wrote:
>> Tariq Toukan <ttoukan.linux@...il.com> writes:
>> 
>>> On 12/11/2025 12:54, Toke Høiland-Jørgensen wrote:
>>>> Tariq Toukan <tariqt@...dia.com> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> This series significantly improves the latency of channel configuration
>>>>> operations, like interface up (create channels), interface down (destroy
>>>>> channels), and channels reconfiguration (create new set, destroy old
>>>>> one).
>>>>
>>>> On the topic of improving ifup/ifdown times, I noticed at some point
>>>> that mlx5 will call synchronize_net() once for every queue when they are
>>>> deactivated (in mlx5e_deactivate_txqsq()). Have you considered changing
>>>> that to amortise the sync latency over the full interface bringdown? :)
>>>>
>>>> -Toke
>>>>
>>>>
>>>
>>> Correct!
>>> This can be improved and I actually have WIP patches for this, as I'm
>>> revisiting this code area recently.
>> 
>> Excellent! We ran into some issues with this a while back, so would be
>> great to see this improved.
>> 
>> -Toke
>> 
>
> Can you elaborate on the test case and issues encountered?
> To make sure I'm addressing them.

Sure, thanks for taking a look!

The high-level issue we've been seeing involves long delays creating and
tearing down OpenShift (Kubernetes) pods that have SR-IOV devices
assigned to them. The worst example involved a test that basically
reboots an application (tearing down its pods and immediately recreating
them), which takes up to ~10 minutes for ~100 pods.

Because a lot of the wait happens with the RTNL held, we also get
cascading errors to other parts of the system. This is how I ended up
digging into what the mlx5 driver was doing while holding the RTNL,
which is where I noticed the "synchronize_net() in a loop" behaviour.
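
To illustrate the pattern, here is a rough user-space sketch, not the
actual driver code. Everything in it is a hypothetical stand-in:
fake_synchronize_net() only counts calls instead of waiting for a grace
period, and struct fake_queue is not an mlx5 structure. It just contrasts
one wait per queue with a single wait for the whole teardown:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for synchronize_net(): counts the waits that
 * would be issued instead of actually blocking. */
static unsigned int sync_calls;

static void fake_synchronize_net(void)
{
	sync_calls++;
}

/* Hypothetical queue state, for illustration only. */
struct fake_queue {
	int enabled;
};

/* "synchronize_net() in a loop": one wait per deactivated queue. */
static unsigned int teardown_sync_per_queue(struct fake_queue *q, size_t n)
{
	assert(q != NULL || n == 0);
	sync_calls = 0;
	for (size_t i = 0; i < n; i++) {
		q[i].enabled = 0;
		fake_synchronize_net();
	}
	return sync_calls;
}

/* Amortised variant: disable every queue first, then wait once. */
static unsigned int teardown_sync_once(struct fake_queue *q, size_t n)
{
	assert(q != NULL || n == 0);
	sync_calls = 0;
	for (size_t i = 0; i < n; i++)
		q[i].enabled = 0;
	if (n > 0)
		fake_synchronize_net();
	return sync_calls;
}
```

In this model the number of waits grows linearly with the queue count in
the first variant and stays at one in the second. Whether the real driver
can be restructured this way depends on ordering constraints that the
sketch does not capture.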

We're working on reducing the blast radius of the RTNL in general, but
the setup/teardown time seems to be driver-specific, so any improvements
here would be welcome, I guess :)

-Toke