Message-ID: <jlvrzm6q7dnai6nf5v3ifhtwqlnvvrdg5driqomnl5q4lzfxmk@tmwaadjob5yd>
Date: Thu, 24 Jul 2025 17:01:16 +0000
From: Dragos Tatulea <dtatulea@...dia.com>
To: Chris Arges <carges@...udflare.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
kernel-team <kernel-team@...udflare.com>, Jesper Dangaard Brouer <hawk@...nel.org>, tariqt@...dia.com,
saeedm@...dia.com, Leon Romanovsky <leon@...nel.org>,
Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>,
Simon Horman <horms@...nel.org>, Andrew Rzeznik <arzeznik@...udflare.com>,
Yan Zhai <yan@...udflare.com>
Subject: Re: [BUG] mlx5_core memory management issue

On Wed, Jul 23, 2025 at 01:48:07PM -0500, Chris Arges wrote:
>
> Ok, we can reproduce this problem!
>
> I tried to simplify this reproducer, but it seems like what's needed is:
> - xdp program attached to mlx5 NIC
> - cpumap redirect
> - device redirect (map or just bpf_redirect)
> - frame gets turned into an skb
> Then from another machine send many flows of UDP traffic to trigger the problem.
>
> I've put together a program that reproduces the issue here:
> - https://github.com/arges/xdp-redirector
>
Much appreciated! I fumbled around at first and couldn't get traffic
to reach the xdp_devmap stage. Further debugging revealed that GRO
has to be enabled on the veth devices for the XDP redirect to the
xdp_devmap to work. With that in place I was able to reproduce your
issue, so now I can start looking into it.
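
For anyone following along, the chain from the steps you listed
looks roughly like this. This is only a sketch: the map sizing, the
CPU selection and the TARGET_IFINDEX placeholder are made up here,
not taken from the reproducer linked above:

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define TARGET_IFINDEX 5 /* hypothetical veth ifindex */

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(max_entries, 64);
	__type(key, __u32);
	__type(value, struct bpf_cpumap_val);
} cpu_map SEC(".maps");

/* Stage 1: attached to the mlx5 uplink; push each frame to a
 * cpumap CPU. */
SEC("xdp")
int redirect_to_cpu(struct xdp_md *ctx)
{
	__u32 cpu = 0; /* a real program would pick a CPU, e.g. by flow hash */

	return bpf_redirect_map(&cpu_map, cpu, XDP_PASS);
}

/* Stage 2: runs on the cpumap CPU; redirect to the veth, where the
 * frame gets turned into an skb (the step the bug seems to need). */
SEC("xdp/cpumap")
int redirect_to_dev(struct xdp_md *ctx)
{
	return bpf_redirect(TARGET_IFINDEX, 0);
}

char LICENSE[] SEC("license") = "GPL";
```

(And for completeness: GRO on the veths can be toggled with
"ethtool -K <veth> gro on".)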
> In general the failure manifests as many different WARNs, such as:
> include/net/page_pool/helpers.h:277 mlx5e_page_release_fragmented.isra.0+0xf7/0x150 [mlx5_core]
> Then the machine crashes.
>
> I was able to get a crashdump which shows:
> ```
> PID: 0 TASK: ffff8c0910134380 CPU: 76 COMMAND: "swapper/76"
> #0 [fffffe10906d3ea8] crash_nmi_callback at ffffffffadc5c4fd
> #1 [fffffe10906d3eb0] default_do_nmi at ffffffffae9524f0
> #2 [fffffe10906d3ed0] exc_nmi at ffffffffae952733
> #3 [fffffe10906d3ef0] end_repeat_nmi at ffffffffaea01bfd
> [exception RIP: io_serial_in+25]
> RIP: ffffffffae4cd489 RSP: ffffb3c60d6049e8 RFLAGS: 00000002
> RAX: ffffffffae4cd400 RBX: 00000000000025d8 RCX: 0000000000000000
> RDX: 00000000000002fd RSI: 0000000000000005 RDI: ffffffffb10a9cb0
> RBP: 0000000000000000 R8: 2d2d2d2d2d2d2d2d R9: 656820747563205b
> R10: 000000002d2d2d2d R11: 000000002d2d2d2d R12: ffffffffb0fa5610
> R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffb10a9cb0
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> --- <NMI exception stack> ---
> #4 [ffffb3c60d6049e8] io_serial_in at ffffffffae4cd489
> #5 [ffffb3c60d6049e8] serial8250_console_write at ffffffffae4d2fcf
> #6 [ffffb3c60d604a80] console_flush_all at ffffffffadd1cf26
> #7 [ffffb3c60d604b00] console_unlock at ffffffffadd1d1df
> #8 [ffffb3c60d604b48] vprintk_emit at ffffffffadd1dda1
> #9 [ffffb3c60d604b98] _printk at ffffffffae90250c
> #10 [ffffb3c60d604bf8] report_bug.cold at ffffffffae95001d
> #11 [ffffb3c60d604c38] handle_bug at ffffffffae950e91
> #12 [ffffb3c60d604c58] exc_invalid_op at ffffffffae9512b7
> #13 [ffffb3c60d604c70] asm_exc_invalid_op at ffffffffaea0123a
> [exception RIP: mlx5e_page_release_fragmented+85]
> RIP: ffffffffc25f75c5 RSP: ffffb3c60d604d20 RFLAGS: 00010293
> RAX: 000000000000003f RBX: ffff8bfa8f059fd0 RCX: ffffe3bf1992a180
> RDX: 000000000000003d RSI: ffffe3bf1992a180 RDI: ffff8bf9b0784000
> RBP: 0000000000000040 R8: 00000000000001d2 R9: 0000000000000006
> R10: ffff8c06de22f380 R11: ffff8bfcfe6cd680 R12: 00000000000001d2
> R13: 000000000000002b R14: ffff8bf9b0784000 R15: 0000000000000000
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #14 [ffffb3c60d604d20] mlx5e_free_rx_wqes at ffffffffc25f7e2f [mlx5_core]
> #15 [ffffb3c60d604d58] mlx5e_post_rx_wqes at ffffffffc25f877c [mlx5_core]
> #16 [ffffb3c60d604dc0] mlx5e_napi_poll at ffffffffc25fdd27 [mlx5_core]
> #17 [ffffb3c60d604e20] __napi_poll at ffffffffae6a8ddb
> #18 [ffffb3c60d604e90] __napi_poll at ffffffffae6a8db5
> #19 [ffffb3c60d604e98] net_rx_action at ffffffffae6a95f1
> #20 [ffffb3c60d604f98] handle_softirqs at ffffffffadc9d4bf
> #21 [ffffb3c60d604fe8] irq_exit_rcu at ffffffffadc9e057
> #22 [ffffb3c60d604ff0] common_interrupt at ffffffffae952015
> --- <IRQ stack> ---
> #23 [ffffb3c60c837de8] asm_common_interrupt at ffffffffaea01466
> [exception RIP: cpuidle_enter_state+184]
> RIP: ffffffffae955c38 RSP: ffffb3c60c837e98 RFLAGS: 00000202
> RAX: ffff8c0cffc00000 RBX: ffff8c0911002400 RCX: 0000000000000000
> RDX: 00003c630b2d073a RSI: ffffffe519600d10 RDI: 0000000000000000
> RBP: 0000000000000001 R8: 0000000000000002 R9: 0000000000000001
> R10: ffff8c0cffc330c4 R11: 071c71c71c71c71c R12: ffffffffb05ff820
> R13: 00003c630b2d073a R14: 0000000000000001 R15: 0000000000000000
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #24 [ffffb3c60c837ed0] cpuidle_enter at ffffffffae64b4ad
> #25 [ffffb3c60c837ef0] do_idle at ffffffffadcfa7c6
> #26 [ffffb3c60c837f30] cpu_startup_entry at ffffffffadcfaa09
> #27 [ffffb3c60c837f40] start_secondary at ffffffffadc5ec77
> #28 [ffffb3c60c837f50] common_startup_64 at ffffffffadc24d5d
> ```
>
> Assuming the x86_64 SysV calling convention (RDI = first argument, RSI = second):
> RDI=ffff8bf9b0784000 (rq)
> RSI=ffffe3bf1992a180 (frag_page)
>
> ```
> static void mlx5e_page_release_fragmented(struct mlx5e_rq *rq,
> 					  struct mlx5e_frag_page *frag_page)
> {
> 	u16 drain_count = MLX5E_PAGECNT_BIAS_MAX - frag_page->frags;
> 	struct page *page = frag_page->page;
> 
> 	if (page_pool_unref_page(page, drain_count) == 0)
> 		page_pool_put_unrefed_page(rq->page_pool, page, -1, true);
> }
> ```
>
> crash> struct mlx5e_frag_page ffffe3bf1992a180
> struct mlx5e_frag_page {
>   page = 0x26ffff800000000,
>   frags = 49856
> }
>
Most fragment-counting bugs tend to show up right here.
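
To spell out why that frags value is fatal in this function:
drain_count is a u16, so once frags has been pushed past the bias,
the subtraction wraps around and the driver tries to drop references
the page never had, presumably what the page_pool helper is WARNing
about. A standalone sketch of the arithmetic (MLX5E_PAGECNT_BIAS_MAX
is assumed to be 64 here, matching RBP=0x40 in your dump; the real
definition lives in the driver headers):

```c
#include <stdint.h>
#include <stdio.h>

#define MLX5E_PAGECNT_BIAS_MAX 64	/* assumed value, see above */

int main(void)
{
	uint16_t frags = 49856;	/* frags value from the crash dump */
	/* Same expression as in mlx5e_page_release_fragmented(). */
	uint16_t drain_count = MLX5E_PAGECNT_BIAS_MAX - frags;

	/* 64 - 49856 wraps mod 65536 to 15744: ~15k page_pool
	 * references would be dropped that were never taken. */
	printf("drain_count = %u\n", drain_count);
	return 0;
}
```
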
Thanks,
Dragos