lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izOkpcpO0KwtOZb0WE5kw+hec8nN9cWGarjT7dupw3Z+UQ@mail.gmail.com>
Date: Wed, 16 Jul 2025 12:16:11 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Tariq Toukan <tariqt@...dia.com>
Cc: Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Saeed Mahameed <saeedm@...dia.com>, 
	Leon Romanovsky <leon@...nel.org>, Mark Bloch <mbloch@...dia.com>, netdev@...r.kernel.org, 
	linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Dragos Tatulea <dtatulea@...dia.com>
Subject: Re: [PATCH net-next] net/mlx5e: TX, Fix dma unmapping for devmem tx

On Wed, Jul 16, 2025 at 12:01 AM Tariq Toukan <tariqt@...dia.com> wrote:
>
> From: Dragos Tatulea <dtatulea@...dia.com>
>
> net_iovs should have the dma address set to 0 so that
> netmem_dma_unmap_page_attrs() correctly skips the unmap. This was
> not done in mlx5 when support for devmem tx was added and resulted
> in the splat below when the platform iommu was enabled.
>
> This patch addresses the issue by using netmem_dma_unmap_addr_set()
> which handles the net_iov case when setting the dma address. A small
> refactoring of mlx5e_dma_push() was required to be able to use this API.
> The function was split in two versions and each version called
> accordingly. Note that netmem_dma_unmap_addr_set() introduces an
> additional if case.
>
> Splat:
>   WARNING: CPU: 14 PID: 2587 at drivers/iommu/dma-iommu.c:1228 iommu_dma_unmap_page+0x7d/0x90
>   Modules linked in: [...]
>   Unloaded tainted modules: i10nm_edac(E):1 fjes(E):1
>   CPU: 14 UID: 0 PID: 2587 Comm: ncdevmem Tainted: G S          E       6.15.0+ #3 PREEMPT(voluntary)
>   Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>   Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 06/01/2022
>   RIP: 0010:iommu_dma_unmap_page+0x7d/0x90
>   Code: [...]
>   RSP: 0000:ff6b1e3ea0b2fc58 EFLAGS: 00010246
>   RAX: 0000000000000000 RBX: ff46ef2d0a2340c8 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
>   RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8827a120
>   R10: 0000000000000000 R11: 0000000000000000 R12: 00000000d8000000
>   R13: 0000000000000008 R14: 0000000000000001 R15: 0000000000000000
>   FS:  00007feb69adf740(0000) GS:ff46ef2c779f1000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007feb69cca000 CR3: 0000000154b97006 CR4: 0000000000773ef0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>    <TASK>
>    dma_unmap_page_attrs+0x227/0x250
>    mlx5e_poll_tx_cq+0x163/0x510 [mlx5_core]
>    mlx5e_napi_poll+0x94/0x720 [mlx5_core]
>    __napi_poll+0x28/0x1f0
>    net_rx_action+0x33a/0x420
>    ? mlx5e_completion_event+0x3d/0x40 [mlx5_core]
>    handle_softirqs+0xe8/0x2f0
>    __irq_exit_rcu+0xcd/0xf0
>    common_interrupt+0x47/0xa0
>    asm_common_interrupt+0x26/0x40
>   RIP: 0033:0x7feb69cd08ec
>   Code: [...]
>   RSP: 002b:00007ffc01b8c880 EFLAGS: 00000246
>   RAX: 00000000c3a60cf7 RBX: 0000000000045e12 RCX: 000000000000000e
>   RDX: 00000000000035b4 RSI: 0000000000000000 RDI: 00007ffc01b8c8c0
>   RBP: 00007ffc01b8c8b0 R08: 0000000000000000 R09: 0000000000000064
>   R10: 00007ffc01b8c8c0 R11: 0000000000000000 R12: 00007feb69cca000
>   R13: 00007ffc01b90e48 R14: 0000000000427e18 R15: 00007feb69d07000
>    </TASK>
>
> Cc: Mina Almasry <almasrymina@...gle.com>
> Reported-by: Stanislav Fomichev <stfomichev@...il.com>
> Closes: https://lore.kernel.org/all/aFM6r9kFHeTdj-25@mini-arch/
> Fixes: 5a842c288cfa ("net/mlx5e: Add TX support for netmems")
> Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
> Reviewed-by: Carolina Jubran <cjubran@...dia.com>
> Signed-off-by: Tariq Toukan <tariqt@...dia.com>


Hmm, a couple of issues I see, but I'm not sure if it's really wrong
or my own non-understanding.

I don't see  netmem_dma_unmap_page_attrs called anywhere in your
driver. The point of  netmem_dma_unmap_addr_set setting the addr to 0
is that a later call to netmem_dma_unmap_page_attrs skips the
dma-unmap call if it's 0.

I could not understand why the mlx5/core/* files still used the
non-netmem variant. The netdev->netmem_tx was set to true in
mlx5/core/en_main.c, so I would have thought at least all the
mlx5/core/en_tx.c callsites should have gone to that.

-- 
Thanks,
Mina

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ