lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <cb1fc819-f761-4d25-8c49-674d72462f95@app.fastmail.com>
Date: Tue, 03 Feb 2026 19:54:31 +0200
From: "Alice Mikityanska" <alice.kernel@...tmail.im>
To: "Tariq Toukan" <ttoukan.linux@...il.com>,
 "Daniel Borkmann" <daniel@...earbox.net>, netdev@...r.kernel.org
Cc: "William Tu" <witu@...dia.com>, "Tariq Toukan" <tariqt@...dia.com>,
 "David Wei" <dw@...idwei.uk>, "Jakub Kicinski" <kuba@...nel.org>,
 "Gal Pressman" <gal@...dia.com>
Subject: Re: [PATCH net-next] net/mlx5e: Undo saving per-channel async ICOSQ

On Tue, Feb 3, 2026, at 16:58, Tariq Toukan wrote:
> On 03/02/2026 11:20, Alice Mikityanska wrote:
>> On Mon, Feb 2, 2026, at 18:13, Daniel Borkmann wrote:
>>> Hi Tariq,
>>>
>>> On 2/1/26 1:50 PM, Tariq Toukan wrote:
>>>> On 26/01/2026 11:23, Daniel Borkmann wrote:
>>>>> On 1/25/26 9:33 AM, Tariq Toukan wrote:
>>>>>> On 24/01/2026 0:39, Daniel Borkmann wrote:
>>>>>>> This reverts the following commits:
>>>>>>>
>>>>>>>     - ea945f4f3991 ("net/mlx5e: Move async ICOSQ lock into ICOSQ struct")
>>>>>>>     - 56aca3e0f730 ("net/mlx5e: Use regular ICOSQ for triggering NAPI")
>>>>>>>     - 1b080bd74840 ("net/mlx5e: Move async ICOSQ to dynamic allocation")
>>>>>>>     - abed42f9cd80 ("net/mlx5e: Conditionally create async ICOSQ")
>>>>>>>
>>>>>>> There are a couple of regressions on the xsk side I ran into:
>>>>>>>
>>>>>>> Commit 56aca3e0f730 triggers an illegal synchronize_rcu() in an RCU read-
>>>>>>> side critical section via mlx5e_xsk_wakeup() -> mlx5e_trigger_napi_icosq()
>>>>>>> -> synchronize_net(). The stack holds RCU read-lock in xsk_poll().
>>>>>>>
>>>>>>> Additionally, this also hits a NULL pointer dereference in mlx5e_xsk_wakeup():
>>>>>>>
>>>>>>>     [  103.963735] BUG: kernel NULL pointer dereference, address: 0000000000000240
>>>>>>>     [  103.963743] #PF: supervisor read access in kernel mode
>>>>>>>     [  103.963746] #PF: error_code(0x0000) - not-present page
>>>>>>>     [  103.963749] PGD 0 P4D 0
>>>>>>>     [  103.963752] Oops: Oops: 0000 [#1] SMP
>>>>>>>     [  103.963756] CPU: 0 UID: 0 PID: 2255 Comm: qemu-system-x86 Not tainted 6.19.0-rc5+ #229 PREEMPT(none)
>>>>>>>     [  103.963761] Hardware name: [...]
>>>>>>>     [  103.963765] RIP: 0010:mlx5e_xsk_wakeup+0x53/0x90 [mlx5_core]
>>>>>>>
>>>>>>> What happens is that c->async_icosq is NULL when in mlx5e_xsk_wakeup()
>>>>>>> and therefore access to c->async_icosq->state triggers it. (On the NIC
>>>>>>> there is an XDP program installed by the control plane where traffic
>>>>>>> gets redirected into an xsk map - there was no xsk pool set up yet.
>>>>>>> At some later time a xsk pool is set up and the related xsk socket is
>>>>>>> added to the xsk map of the XDP program.)
>>>>>>
>>>>>> Thanks for your report.
>>>>>>
>>>>>>> Reverting the series fixes the problems again.
>>>>>>
>>>>>> Revert is too aggressive here. A fix is preferable.
>>>>>> We're investigating the issue in order to fix it.
>>>>>> We'll update.
>>>>> Ok, sounds good. Certainly the kTLS fixes seem independent, from the cause
>>>>> of the issues I've hit it just seemed to me that they were quite fundamental
>>>>> and that perhaps a different approach would be needed (or alternatively only
>>>>> kTLS would need fixing, and the xsk optimization left as it was originally).
>>>>> Anyway, I'll keep the revert locally for now, and happy to test patches.
>>>>
>>>> Please check attached patch.
>> 
>> Hi Tariq,
>> 
>> I also took a glance at the patch: that seems to be correct in XSK parts, and mlx5e_trigger_napi_icosq now makes sense to me after your explanation. 
>
> Great.
>
>> Two small comments though:
>> 
>> 1. I'd prefer DEBUG_NET_WARN_ON_ONCE instead of WARN_ON. At least XSK wakeup is a hot path, called frequently, so we don't want to flood dmesg with the same error, and it would be better to compile out the check in non-debug builds, hence DEBUG_NET_WARN_ON_ONCE.
>> 
>
> I tend to totally drop this check.
>
>> 2. I see you changed the condition of creating async_icosq to be created when any xdp_prog is attached, even when there are no XSK pools. Is there a case where async_icosq is useful with plain XDP without XSK? If not, I believe this condition should be XDP && XSK.
>> 
>
> You're right.
> As you know, the channels are not re-opened on XSK pool events.
> They are re-opened on XDP program attach/detach.

Ah, that's right, I see. I didn't pay attention to where the check is
located. If you reorganize the code a little bit, it's possible to
implement the perfect check (XDP && XSK) || kTLS_RX, to avoid creating
the unneeded async_icosq with plain XDP.

(BTW, I see that ktls_rx_was_enabled is never reset to false: is it
because we can't reliably fence the moment when the driver stops using
it?)

Anyway, it's up to you if you'd like to do it or not, but here is my
suggestion. We don't have to create async_icosq in mlx5e_open_queues.
Similar to how we separated mlx5e_open_xsk, we can separate
mlx5e_open_async_icosq, which is where we can check (XDP && XSK) ||
kTLS_RX.

Then mlx5e_open_channel will look like:

mlx5e_open_queues();
mlx5e_open_async_icosq(); // Checks the condition internally
if (xsk_pool)
        mlx5e_open_xsk();

mlx5e_xsk_enable_locked will also call mlx5e_open_async_icosq just
before mlx5e_open_xsk.

mlx5e_close_async_icosq will close it if it exists.

With this design, all cases are covered: every mlx5e_open_xsk is paired
with mlx5e_open_async_icosq; and kTLS RX enable flow reopens channels,
which also triggers mlx5e_open_async_icosq. The conditional is DRYed in
mlx5e_open_async_icosq itself, so no code repetition either.

> Hence, as a simple and quick fix, we will check for XDP program and 
> create the async ICOSQ accordingly.
> Currently we won't have create/destroy async ICOSQ when XSK pool is 
> activated/deactivated. I will mention this in the commit message.
>
> This still prevents the async ICOSQ creation in default.
>
>> Thanks,
>> Alice
>> 
>
>
>>>> We were able to repro the issues internally and verify the fix.
>>>> We're finalizing it before submission.
>>>>
>>>> I'd be glad if you can confirm it solves the issues for you.
>>> That seems to work for me, yes. Feel free to add my Tested-by.
>>>
>>> Thanks,
>>> Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ