Message-ID: <CAKq9yRgO3akVUoz=H_vKgMjoDowq=owq5snPhmKLi4c=taLTnA@mail.gmail.com>
Date: Sat, 16 Mar 2024 01:39:07 +0100
From: Daniele Salvatore Albano <d.albano@...il.com>
To: netdev@...r.kernel.org
Subject: [mlx5_core] kernel NULL pointer dereference when sending packets with
AF_XDP using the hw checksum
Hey there,
I hope this is the right mailing list; if not, sorry in advance.
I have been hitting a reproducible kernel panic on 6.8.0 and 6.8.1
when sending packets with AF_XDP with HW checksum calculation enabled
on my Mellanox ConnectX-5.
To reproduce, I run xskgen ( https://github.com/fomichev/xskgen ), which
I saw mentioned in some patches related to AF_XDP and HW checksum
support. On top of the minimum parameters needed to make it work, adding
the -m option is enough to trigger the kernel panic.
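For context, here is a minimal sketch of what a TX checksum offload request looks like from an AF_XDP application. It assumes that -m makes xskgen attach TX metadata to each frame, which is not confirmed here. The struct and flag names prefixed with sketch_/SKETCH_ are local mirrors of struct xsk_tx_metadata, struct xdp_desc, XDP_TXMD_FLAGS_CHECKSUM and XDP_TX_METADATA from <linux/if_xdp.h> in 6.8, renamed so the sketch builds without those headers; request_l4_csum() is a hypothetical helper, not something taken from xskgen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of UAPI definitions from <linux/if_xdp.h> (Linux 6.8). */
#define SKETCH_TXMD_FLAGS_CHECKSUM (1u << 1) /* mirrors XDP_TXMD_FLAGS_CHECKSUM */
#define SKETCH_TX_METADATA         (1u << 1) /* mirrors XDP_TX_METADATA */

struct sketch_tx_metadata {                  /* mirrors struct xsk_tx_metadata */
	uint64_t flags;
	union {
		struct {
			uint16_t csum_start;   /* offset from packet start */
			uint16_t csum_offset;  /* offset from csum_start */
		} request;
		struct {
			uint64_t tx_timestamp; /* filled in on TX completion */
		} completion;
	};
};

struct sketch_desc {                         /* mirrors struct xdp_desc */
	uint64_t addr;
	uint32_t len;
	uint32_t options;
};

/*
 * Hypothetical helper: ask the NIC to compute the L4 checksum for the
 * frame described by desc. The metadata block sits immediately before
 * the packet data in the UMEM, so it is written at
 * desc->addr - sizeof(metadata).
 */
static void request_l4_csum(uint8_t *umem, struct sketch_desc *desc,
			    uint16_t l4_start, uint16_t csum_field_off)
{
	struct sketch_tx_metadata meta;

	/* There must be headroom for the metadata in front of the packet. */
	assert(desc->addr >= sizeof(meta));

	memset(&meta, 0, sizeof(meta));
	meta.flags = SKETCH_TXMD_FLAGS_CHECKSUM;
	meta.request.csum_start = l4_start;
	meta.request.csum_offset = csum_field_off;

	/* memcpy avoids alignment assumptions about the UMEM buffer. */
	memcpy(umem + desc->addr - sizeof(meta), &meta, sizeof(meta));

	/* Tell the kernel that this descriptor carries TX metadata. */
	desc->options |= SKETCH_TX_METADATA;
}
```

For a UDP/IPv4 frame without VLAN tags or IP options, the call would be request_l4_csum(umem, &desc, 34, 6): the UDP header starts 14 + 20 bytes into the frame and its checksum field is 6 bytes into the header. A real application would also need to register the UMEM with a matching TX metadata length and pre-fill the pseudo-header checksum.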
This is a mainline kernel build from Ubuntu.
Below is the output from dmesg:
[ 157.108211] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 157.108264] #PF: supervisor write access in kernel mode
[ 157.108284] #PF: error_code(0x0002) - not-present page
[ 157.108304] PGD 302a724067 P4D 302a724067 PUD 3027e99067 PMD 0
[ 157.108332] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 157.108352] CPU: 19 PID: 132 Comm: ksoftirqd/19 Not tainted 6.8.0-060800-generic #202403131158
[ 157.108379] Hardware name: Supermicro Super Server/H11SSL-i, BIOS 2.1 02/21/2020
[ 157.108402] RIP: 0010:mlx5e_free_xdpsq_desc+0x266/0x320 [mlx5_core]
[ 157.108576] Code: 94 24 58 02 00 00 49 8b 8c 24 50 02 00 00 48 8d 7d c0 8b 02 8d 70 01 89 32 41 23 84 24 68 02 00 00 4c 8b 2c c1 e8 ca fc ff ff <49> 89 45 00 e9 ce fe ff ff 41 8b 47 20 41 0f b7 57 0a 48 2d 68 01
[ 157.108626] RSP: 0018:ffffa8668cd13b90 EFLAGS: 00010246
[ 157.108647] RAX: 17bd161cd26e8f20 RBX: 0000000000000000 RCX: 0000000000000000
[ 157.108670] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 157.108693] RBP: ffffa8668cd13c08 R08: 0000000000000000 R09: 0000000000000000
[ 157.108715] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d5e420d3340
[ 157.108737] R13: 0000000000000000 R14: ffffffffffffffff R15: 0000000000000000
[ 157.108759] FS: 0000000000000000(0000) GS:ffff8d6ddf780000(0000) knlGS:0000000000000000
[ 157.108784] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 157.108804] CR2: 0000000000000000 CR3: 0000003028e5c000 CR4: 00000000003506f0
[ 157.108827] Call Trace:
[ 157.108841] <TASK>
[ 157.108855] ? show_regs+0x6d/0x80
[ 157.108876] ? __die+0x24/0x80
[ 157.108893] ? page_fault_oops+0x99/0x1b0
[ 157.108916] ? do_user_addr_fault+0x2ee/0x6b0
[ 157.108937] ? exc_page_fault+0x83/0x1b0
[ 157.108958] ? asm_exc_page_fault+0x27/0x30
[ 157.108986] ? mlx5e_free_xdpsq_desc+0x266/0x320 [mlx5_core]
[ 157.109154] mlx5e_poll_xdpsq_cq+0x17c/0x4f0 [mlx5_core]
[ 157.109324] mlx5e_napi_poll+0x45e/0x7b0 [mlx5_core]
[ 157.109470] __napi_poll+0x33/0x200
[ 157.109488] net_rx_action+0x181/0x2e0
[ 157.109502] ? sched_clock_cpu+0x12/0x1e0
[ 157.109524] __do_softirq+0xe1/0x363
[ 157.109544] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 157.109565] run_ksoftirqd+0x37/0x60
[ 157.109582] smpboot_thread_fn+0xe3/0x1e0
[ 157.109600] kthread+0xf2/0x120
[ 157.109616] ? __pfx_kthread+0x10/0x10
[ 157.109632] ret_from_fork+0x47/0x70
[ 157.109648] ? __pfx_kthread+0x10/0x10
[ 157.109663] ret_from_fork_asm+0x1b/0x30
[ 157.109686] </TASK>
[ 157.109696] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack xt_comment ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink cfg80211 binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd ipmi_ssif kvm irqbypass rapl acpi_ipmi ccp k10temp ipmi_si ipmi_devintf joydev input_leds ipmi_msghandler mac_hid br_netfilter dm_multipath bridge scsi_dh_rdac scsi_dh_emc stp llc scsi_dh_alua overlay msr efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 mlx5_ib ib_uverbs macsec ib_core hid_generic usbhid hid mlx5_core crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel mlxfw sha256_ssse3 psample nvme sha1_ssse3 igb tls ahci nvme_core ast pci_hyperv_intf libahci dca i2c_piix4 xhci_pci nvme_auth i2c_algo_bit xhci_pci_renesas aesni_intel crypto_simd cryptd
[ 157.113195] CR2: 0000000000000000
[ 157.113607] ---[ end trace 0000000000000000 ]---
[ 157.877621] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1263523800 wd_nsec: 1263521131
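As a reading aid for the "#PF: error_code(0x0002)" line in the trace above, the sketch below decodes the low bits of an x86 page-fault error code. The bit meanings follow the x86 architecture definition; struct pf_error and decode_pf_error() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the low five bits of an x86 page-fault error code. */
struct pf_error {
	bool protection;   /* bit 0: 1 = protection violation, 0 = not-present page */
	bool write;        /* bit 1: 1 = write access, 0 = read access */
	bool user;         /* bit 2: 1 = user mode, 0 = supervisor (kernel) mode */
	bool reserved_bit; /* bit 3: reserved bit set in a paging structure */
	bool instr_fetch;  /* bit 4: fault caused by an instruction fetch */
};

static struct pf_error decode_pf_error(uint32_t code)
{
	struct pf_error e = {
		.protection   = (code & (1u << 0)) != 0,
		.write        = (code & (1u << 1)) != 0,
		.user         = (code & (1u << 2)) != 0,
		.reserved_bit = (code & (1u << 3)) != 0,
		.instr_fetch  = (code & (1u << 4)) != 0,
	};
	return e;
}
```

For 0x0002 only the write bit is set, which matches the "supervisor write access in kernel mode" and "not-present page" lines: a kernel-mode store to an unmapped address, here address 0 according to CR2. A quick check could be written as assert(decode_pf_error(0x0002).write).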
Thanks,
Daniele