2024-05-08 11:08:45.728 May 8 09:08:45 10.211.164.42 [777338.518329] mlx5_core 0000:18:00.1: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-08 11:08:45.728 May 8 09:08:45 10.211.164.42 [777338.538411] mlx5_core 0000:18:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-08 11:08:47.482 May 8 09:08:47 10.211.164.42 [777340.386168] mlx5_core 0000:18:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-08 11:08:48.735 May 8 09:08:48 10.211.164.42 [777341.519135] mlx5_core 0000:18:00.1: E-Switch: cleanup
2024-05-08 11:09:49.649 May 8 09:09:49 10.211.164.42 [777402.547905] mlx5_core 0000:18:00.1: wait_func:1155:(pid 3050225): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
2024-05-08 11:09:49.649 May 8 09:09:49 10.211.164.42 [777402.561549] mlx5_core 0000:18:00.1: mlx5_function_close:1288:(pid 3050225): tear_down_hca failed, skip cleanup
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.668373] pci 0000:18:00.1: [15b3:1013] type 00 class 0x020000
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.675412] pci 0000:18:00.1: reg 0x10: [mem 0x387ffc000000-0x387ffdffffff 64bit pref]
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.684187] pci 0000:18:00.1: reg 0x30: [mem 0xfff00000-0xffffffff pref]
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.691841] pci 0000:18:00.1: PME# supported from D3cold
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.699069] pci 0000:18:00.1: Adding to iommu group 9
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.705201] pci 0000:18:00.1: BAR 0: assigned [mem 0x387ffc000000-0x387ffdffffff 64bit pref]
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.714597] pci 0000:18:00.1: BAR 6: assigned [mem 0x9da00000-0x9dafffff pref]
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.725140] mlx5_core 0000:18:00.1: firmware version: 12.28.2006
2024-05-08 11:09:49.899 May 8 09:09:49 10.211.164.42 [777402.731694] mlx5_core 0000:18:00.1: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link) 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.942796] BUG: unable to handle page fault for address: ffffb8ecc6ea8230 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.950173] #PF: supervisor read access in kernel mode 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.955703] #PF: error_code(0x0000) - not-present page 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.961159] PGD 100c00067 P4D 100c00067 PUD 100e54067 PMD 10601c067 PTE 0 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.968266] Oops: 0000 [#1] PREEMPT SMP NOPTI 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.972943] CPU: 0 PID: 3053824 Comm: reactor_0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.983527] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.994284] RIP: 0010:ioread32be+0x34/0x60 2024-05-08 11:09:50.150 May 8 09:09:49 10.211.164.42 [777402.998712] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac bc 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.018096] RSP: 0000:ffffb8eccb5efda0 EFLAGS: 00010292 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.023675] RAX: ffffb8ecc6ea8200 RBX: ffff9f01f639e1a0 RCX: ffff9f091f6222c0 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.031162] RDX: 000000012e52cbc0 RSI: ffffffffc07e2db0 RDI: ffffb8ecc6ea8230 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.038647] RBP: ffff9f01f639e1a0 R08: 0000000004b94b30 R09: ffff9f091f6222e8 2024-05-08 11:09:50.150 May 8 09:09:50 
10.211.164.42 [777403.046133] R10: 0000000000000000 R11: 000000000000029f R12: ffffffffc07e2db0 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.053620] R13: ffffb8ecc6ea8200 R14: ffffb8eccb5efe48 R15: ffff9f091f6222c0 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.061107] FS: 00007ff965d34a00(0000) GS:ffff9f091f600000(0000) knlGS:0000000000000000 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.069550] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.075645] CR2: ffffb8ecc6ea8230 CR3: 00000001ac8e8001 CR4: 00000000007706f0 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.083130] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.090616] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.098103] PKRU: 55555554 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.101164] Call Trace: 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.103962] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.106411] ? __die+0x23/0x70 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.109815] ? page_fault_oops+0x171/0x4e0 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.114261] ? exc_page_fault+0x175/0x180 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.118619] ? asm_exc_page_fault+0x26/0x30 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.123141] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.128729] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.134292] ? 
ioread32be+0x34/0x60 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.138127] mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.144812] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.150364] poll_health+0x42/0x230 [mlx5_core] 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.155310] ? __next_timer_interrupt+0x9b/0x110 2024-05-08 11:09:50.150 May 8 09:09:50 10.211.164.42 [777403.160265] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.165807] call_timer_fn+0x21/0x130 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.169798] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.175332] __run_timers+0x222/0x2c0 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.179311] run_timer_softirq+0x1d/0x40 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.183545] __do_softirq+0xc9/0x2c8 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.187435] __irq_exit_rcu+0xa6/0xc0 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.191411] sysvec_apic_timer_interrupt+0x3e/0x90 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.196510] asm_sysvec_apic_timer_interrupt+0x1a/0x20 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.201955] RIP: 0033:0x7ff968316373 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.205856] Code: 48 89 da 48 8d 42 08 48 83 fa f8 72 15 48 89 c2 48 89 de 48 8d 05 5d 40 15 00 48 89 c7 e8 75 7c f0 ff f0 48 ff 05 c5 b3 1d 00 <48> 8b 5b 08 48 89 d8 48 85 db 74 0b 48 89 c2 83 e2 07 48 85 d2 74 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.225146] RSP: 002b:00007ffdc152ac50 EFLAGS: 00000203 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.230686] RAX: 0000000001285d88 RBX: 0000000001285d80 RCX: 000020000b2ffe00 2024-05-08 
11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.238132] RDX: 0000000001285d80 RSI: 000000000126e950 RDI: 00000000012942e0 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.245582] RBP: 00007ffdc152ac70 R08: 0000000000000000 R09: 000000000000000f 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.253024] R10: 0000000000000000 R11: 00000000018f2000 R12: 0000000000000007 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.260480] R13: 0000000000000000 R14: 00007ff968f2c000 R15: 0000000000435720 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.267918] 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.270416] Modules linked in: nvme_rdma nvme_fabrics vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd rdma_ucm rdma_cm iw_cm ib_umad ib_cm rfkill usdm_drv(OE) mlx5_ib sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common skx_edac nfit binfmt_misc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ib_uverbs macsec qat_c62x(OE) kvm intel_qat(OE) irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 spi_nor rapl iTCO_wdt intel_pmc_bxt mtd iTCO_vendor_support ipmi_ssif intel_cstate acpi_ipmi dax_pmem pcspkr intel_uncore spi_intel_pci ipmi_si ib_core uio ast mei_me i2c_i801 mei ipmi_devintf switchtec i2c_algo_bit lpc_ich spi_intel i2c_smbus intel_pch_thermal ioatdma wmi ipmi_msghandler joydev acpi_pad acpi_power_meter ip6_tables ip_tables fuse zram bpf_preload loop overlay squashfs netconsole nd_pmem nd_btt nd_e820 libnvdimm virtio_blk 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 virtio_net net_failover failover nvme 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.270477] nvme_core nvme_auth ixgbe mdio dca i40e mlx5_core mlxfw psample tls pci_hyperv_intf [last unloaded: nvme_fabrics] 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.374535] CR2: 
ffffb8ecc6ea8230 2024-05-08 11:09:50.401 May 8 09:09:50 10.211.164.42 [777403.378185] ---[ end trace 0000000000000000 ]--- 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.510500] RIP: 0010:ioread32be+0x34/0x60 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.514985] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac bc 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.534327] RSP: 0000:ffffb8eccb5efda0 EFLAGS: 00010292 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.539895] RAX: ffffb8ecc6ea8200 RBX: ffff9f01f639e1a0 RCX: ffff9f091f6222c0 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.547382] RDX: 000000012e52cbc0 RSI: ffffffffc07e2db0 RDI: ffffb8ecc6ea8230 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.554872] RBP: ffff9f01f639e1a0 R08: 0000000004b94b30 R09: ffff9f091f6222e8 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.562361] R10: 0000000000000000 R11: 000000000000029f R12: ffffffffc07e2db0 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.569847] R13: ffffb8ecc6ea8200 R14: ffffb8eccb5efe48 R15: ffff9f091f6222c0 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.577338] FS: 00007ff965d34a00(0000) GS:ffff9f091f600000(0000) knlGS:0000000000000000 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.585790] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.591898] CR2: ffffb8ecc6ea8230 CR3: 00000001ac8e8001 CR4: 00000000007706f0 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.599413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.606911] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 2024-05-08 11:09:50.651 May 8 09:09:50 
10.211.164.42 [777403.614407] PKRU: 55555554
2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.617481] Kernel panic - not syncing: Fatal exception in interrupt
2024-05-08 11:09:50.651 May 8 09:09:50 10.211.164.42 [777403.624290] Kernel Offset: 0x3a000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
2024-05-08 11:09:50.901 May 8 09:09:50 10.211.164.42 [777403.781675] Rebooting in 5 seconds..
# WFP29
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.364514] BUG: unable to handle page fault for address: ffff9a8706ae4230
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.371751] #PF: supervisor read access in kernel mode
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.377222] #PF: error_code(0x0000) - not-present page
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.382670] PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105af4067 PTE 0
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.389774] Oops: 0000 [#1] PREEMPT SMP PTI
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.394245] CPU: 0 PID: 3465507 Comm: reactor_0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.404806] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.415532] RIP: 0010:ioread32be+0x34/0x60
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.419927] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac ba
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.439298] RSP: 0000:ffff9a8734553da0 EFLAGS: 00010292
2024-05-09 11:11:34.559 May 9 09:11:34 10.211.164.58 [76328.444848] RAX: ffff9a8706ae4200 RBX: ffff8c0ccf20a1a0 RCX: ffff8c131f8222c0
2024-05-09 11:11:34.810 May
9 09:11:34 10.211.164.58 [76328.452289] RDX: 0000000104881ac0 RSI: ffffffffc0500db0 RDI: ffff9a8706ae4230 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.459732] RBP: ffff8c0ccf20a1a0 R08: 000000000412206c R09: ffff8c131f8222e8 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.467180] R10: 0000000000000004 R11: 0000000000000345 R12: ffffffffc0500db0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.474624] R13: ffff9a8706ae4200 R14: ffff9a8734553e48 R15: ffff8c131f8222c0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.482068] FS: 00007f9e64245a00(0000) GS:ffff8c131f800000(0000) knlGS:0000000000000000 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.490468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.496537] CR2: ffff9a8706ae4230 CR3: 0000000895142002 CR4: 00000000007706f0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.503990] PKRU: 55555554 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.507006] Call Trace: 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.509769] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.512164] ? __die+0x23/0x70 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.515514] ? page_fault_oops+0x171/0x4e0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.519908] ? exc_page_fault+0x175/0x180 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.524211] ? asm_exc_page_fault+0x26/0x30 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.528679] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.534232] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.539756] ? 
ioread32be+0x34/0x60 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.543527] mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.550159] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.555649] poll_health+0x42/0x230 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.560528] ? __next_timer_interrupt+0x9b/0x110 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.565420] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.570908] call_timer_fn+0x21/0x130 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.574835] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.580313] __run_timers+0x222/0x2c0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.584227] run_timer_softirq+0x1d/0x40 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.588401] __do_softirq+0xc9/0x2c8 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.592229] __irq_exit_rcu+0xa6/0xc0 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.596156] sysvec_apic_timer_interrupt+0x3e/0x90 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.601197] asm_sysvec_apic_timer_interrupt+0x1a/0x20 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.606574] RIP: 0033:0x7f9e65ed9cb2 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.610417] Code: c7 e8 fe d7 ff ff 89 45 d4 f0 48 ff 05 57 3b 05 00 8b 45 d4 3b 45 c8 0f 8e c8 fe ff ff e9 b5 fe ff ff f0 48 ff 05 56 3b 05 00 <48> 8b 45 b8 48 83 7d b8 00 74 0b 48 89 c2 83 e2 07 48 85 d2 74 12 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.629675] RSP: 002b:00007ffed20a3b70 EFLAGS: 00000212 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.635160] RAX: 0000000000000000 RBX: 00000000005e1608 RCX: 00002000008ff200 2024-05-09 11:11:34.810 May 
9 09:11:34 10.211.164.58 [76328.642553] RDX: 0000000000000000 RSI: 00007ffed20a3b10 RDI: 00002000008ff200 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.649945] RBP: 00007ffed20a3bd0 R08: 0000000000000000 R09: 000000000000000f 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.657334] R10: 0000000000000000 R11: 0000000001071000 R12: 0000000000000007 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.664726] R13: 0000000000000000 R14: 00007f9e6744a000 R15: 0000000000435720 2024-05-09 11:11:34.810 May 9 09:11:34 10.211.164.58 [76328.672120] 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.674564] Modules linked in: nvme_rdma nvme_fabrics mlx5_ib nvme_keyring iptable_filter bridge stp llc veth xfs rdma_ucm rdma_cm iw_cm ib_umad ib_cm vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd nbd usdm_drv(OE) intel_qat(OE) rfkill uio intel_rapl_msr sunrpc intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp binfmt_misc kvm_intel ib_uverbs macsec kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 ipmi_ssif sha256_ssse3 sha1_ssse3 rapl intel_cstate spi_nor mtd iTCO_wdt intel_pmc_bxt iTCO_vendor_support acpi_ipmi mei_me spi_intel_pci ib_core intel_uncore ast i2c_i801 ipmi_si pcspkr mei ioatdma spi_intel i2c_algo_bit i2c_smbus lpc_ich intel_pch_thermal ipmi_devintf wmi dca ipmi_msghandler joydev acpi_pad acpi_power_meter ip6_tables ip_tables fuse zram bpf_preload loop overlay squashfs netconsole nd_pmem nd_btt nd 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 _e820 libnvdimm virtio_blk 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.674641] virtio_net net_failover failover nvme nvme_core nvme_auth i40e mlx5_core mlxfw psample tls pci_hyperv_intf [last unloaded: nvme_fabrics] 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.780005] CR2: 
ffff9a8706ae4230 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.783587] ---[ end trace 0000000000000000 ]--- 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.861327] pstore: backend (erst) writing error (-22) 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.866781] RIP: 0010:ioread32be+0x34/0x60 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.871152] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac ba 2024-05-09 11:11:35.061 May 9 09:11:34 10.211.164.58 [76328.890457] RSP: 0000:ffff9a8734553da0 EFLAGS: 00010292 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.895977] RAX: ffff9a8706ae4200 RBX: ffff8c0ccf20a1a0 RCX: ffff8c131f8222c0 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.903396] RDX: 0000000104881ac0 RSI: ffffffffc0500db0 RDI: ffff9a8706ae4230 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.910819] RBP: ffff8c0ccf20a1a0 R08: 000000000412206c R09: ffff8c131f8222e8 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.918248] R10: 0000000000000004 R11: 0000000000000345 R12: ffffffffc0500db0 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.925677] R13: ffff9a8706ae4200 R14: ffff9a8734553e48 R15: ffff8c131f8222c0 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.933110] FS: 00007f9e64245a00(0000) GS:ffff8c131f800000(0000) knlGS:0000000000000000 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.941500] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.947545] CR2: ffff9a8706ae4230 CR3: 0000000895142002 CR4: 00000000007706f0 2024-05-09 11:11:35.061 May 9 09:11:35 10.211.164.58 [76328.954986] PKRU: 55555554 2024-05-09 11:11:35.312 May 9 09:11:35 10.211.164.58 [76328.957997] Kernel panic - not syncing: Fatal exception in interrupt 2024-05-09 
11:11:35.312 May 9 09:11:35 10.211.164.58 [76328.964746] Kernel Offset: 0x38000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
2024-05-09 11:11:35.312 May 9 09:11:35 10.211.164.58 [76328.979761] Rebooting in 5 seconds..
# WFP47
2024-05-09 17:33:54.338 May 9 15:33:54 10.211.164.76 [197305.384176] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-09 17:33:54.588 May 9 15:33:54 10.211.164.76 [197305.409837] mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-09 17:33:56.343 May 9 15:33:56 10.211.164.76 [197307.336879] mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
2024-05-09 17:33:57.346 May 9 15:33:57 10.211.164.76 [197308.233318] mlx5_core 0000:18:00.0: E-Switch: cleanup
2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.259435] mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout.
Will cause a leak of a command resource 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.273047] mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.288331] BUG: unable to handle page fault for address: ffffa26487064230 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.295828] #PF: supervisor read access in kernel mode 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.301421] #PF: error_code(0x0000) - not-present page 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.306976] PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.314170] Oops: 0000 [#1] PREEMPT SMP PTI 2024-05-09 17:34:58.271 May 9 15:34:58 10.211.164.76 [197369.318759] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.328905] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.339745] RIP: 0010:ioread32be+0x34/0x60 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.344266] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac 87 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.363768] RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.369415] RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.376972] RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.384525] RBP: ffff9042d08161a0 R08: 0000000000000022 R09: 
ffff904c108222e8 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.392080] R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.399636] R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.407191] FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.415696] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.421869] CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.429428] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.436983] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.444540] PKRU: 55555554 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.447673] Call Trace: 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.450538] 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.452972] ? __die+0x23/0x70 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.456459] ? page_fault_oops+0x171/0x4e0 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.460977] ? exc_page_fault+0x175/0x180 2024-05-09 17:34:58.521 May 9 15:34:58 10.211.164.76 [197369.465406] ? asm_exc_page_fault+0x26/0x30 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.470006] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.475677] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.481329] ? 
ioread32be+0x34/0x60 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.485211] mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.491971] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.497576] poll_health+0x42/0x230 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.502568] ? __next_timer_interrupt+0xbc/0x110 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.507565] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.513150] call_timer_fn+0x21/0x130 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.517182] ? __pfx_poll_health+0x10/0x10 [mlx5_core] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.522767] __run_timers+0x222/0x2c0 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.526795] run_timer_softirq+0x1d/0x40 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.531069] __do_softirq+0xc9/0x2c8 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.535016] __irq_exit_rcu+0xa6/0xc0 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.539025] sysvec_apic_timer_interrupt+0x72/0x90 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.544168] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.546613] 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.549051] asm_sysvec_apic_timer_interrupt+0x1a/0x20 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.554525] RIP: 0010:cpuidle_enter_state+0xcc/0x440 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.559818] Code: ea 22 19 ff e8 c5 f3 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 33 26 18 ff 45 84 ff 0f 85 56 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d 2024-05-09 17:34:58.522 May 9 15:34:58 10.211.164.76 [197369.579145] RSP: 0018:ffffffff88403e28 EFLAGS: 
00000246 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.584709] RAX: ffff904c10834180 RBX: ffff904c1083f498 RCX: 000000000000001f 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.592181] RDX: 0000000000000000 RSI: 0000000037c86f51 RDI: 0000000000000000 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.599652] RBP: 0000000000000002 R08: 0000000000000002 R09: 0000000000000064 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.607119] R10: 0000000000000018 R11: ffff904c10832ae4 R12: ffffffff8863e2c0 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.614585] R13: 0000b3819e51f31b R14: 0000000000000002 R15: 0000000000000000 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.622049] ? cpuidle_enter_state+0xbd/0x440 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.626736] cpuidle_enter+0x2d/0x40 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.630627] do_idle+0x20d/0x270 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.634182] cpu_startup_entry+0x2a/0x30 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.638417] rest_init+0xd0/0xd0 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.641953] arch_call_rest_init+0xe/0x30 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.646278] start_kernel+0x709/0xa90 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.650242] x86_64_start_reservations+0x18/0x30 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.655163] x86_64_start_kernel+0x96/0xa0 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.659561] secondary_startup_64_no_verify+0x18f/0x19b 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.665081] 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.667563] Modules linked in: nvme_rdma nvme_fabrics mlx5_ib nvme_keyring xfs rdma_ucm rdma_cm iw_cm ib_umad ib_cm vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd nbd usdm_drv(OE) intel_qat(OE) rfkill uio 
sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp ib_uverbs macsec coretemp kvm_intel binfmt_misc kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 spi_nor ipmi_ssif iTCO_wdt rapl mei_me intel_cstate intel_pmc_bxt iTCO_vendor_support mtd acpi_ipmi ipmi_si i2c_i801 ast pcspkr spi_intel_pci ib_core dax_pmem intel_uncore mei switchtec ioatdma i2c_smbus lpc_ich intel_pch_thermal spi_intel i2c_algo_bit ipmi_devintf wmi dca ipmi_msghandler joydev acpi_power_meter acpi_pad ip6_tables ip_tables fuse zram bpf_preload loop overlay squashfs netconsole nd_pmem nd_btt nd_e820 libnvdimm 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 virtio_blk virtio_net net_failover 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.667636] failover nvme nvme_core nvme_auth i40e mlx5_core mlxfw psample tls pci_hyperv_intf [last unloaded: nvme_fabrics] 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.771564] CR2: ffffa26487064230 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.775245] ---[ end trace 0000000000000000 ]--- 2024-05-09 17:34:58.772 May 9 15:34:58 10.211.164.76 [197369.829327] RIP: 0010:ioread32be+0x34/0x60 2024-05-09 17:34:59.022 May 9 15:34:58 10.211.164.76 [197369.833855] Code: 00 77 27 48 81 ff 00 00 01 00 76 0a 89 fa ed 0f c8 c3 cc cc cc cc 8b 05 3a 49 e0 01 85 c0 75 13 b8 ff ff ff ff c3 cc cc cc cc <8b> 07 0f c8 c3 cc cc cc cc 83 e8 01 48 89 fe 48 c7 c2 94 6e ac 87 2024-05-09 17:34:59.022 May 9 15:34:58 10.211.164.76 [197369.853260] RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.858869] RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.866387] RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: 
ffffa26487064230 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.873901] RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.881413] R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.888931] R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.896454] FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.904936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.911073] CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.918597] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.926132] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.933663] PKRU: 55555554 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.936776] Kernel panic - not syncing: Fatal exception in interrupt 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197369.943597] Kernel Offset: 0x5000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) 2024-05-09 17:34:59.023 May 9 15:34:58 10.211.164.76 [197370.004783] Rebooting in 5 seconds.. 
# GP20
2024-05-10 08:31:00.823 May 10 06:31:00 10.211.164.207 [ 1327.764608] mlx5_core 0000:81:00.1: firmware version: 14.25.1020 2024-05-10 08:31:00.823 May 10 06:31:00 10.211.164.207 [ 1327.771781] mlx5_core 0000:81:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link) 2024-05-10 08:31:01.074 May 10 06:31:00 10.211.164.207 [ 1327.996739] mlx5_core 0000:81:00.1: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384) 2024-05-10 08:31:01.074 May 10 06:31:00 10.211.164.207 [ 1328.033868] mlx5_core 0000:81:00.1: Port module event: module 1, Cable unplugged 2024-05-10 08:31:01.325 May 10 06:31:01 10.211.164.207 [ 1328.235033] mlx5_core 0000:81:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic) 2024-05-10 08:31:01.325 May 10 06:31:01 10.211.164.207 [ 1328.249482] mlx5_core 0000:81:00.1 mlx_0_1: renamed from eth6 2024-05-10 08:31:01.575 May 10 06:31:01 10.211.164.207 [ 1328.561080] mlx5_core 0000:81:00.1 mlx_0_1: Link down 2024-05-10 08:31:01.575 May 10 06:31:01 10.211.164.207 [ 1328.619586] mlx5_core 0000:81:00.1: is_dpll_supported:213:(pid 188689): Missing SyncE capability 2024-05-10 08:32:26.056 May 10 06:32:25 10.211.164.207 [ 1413.141628] mlx5_core 0000:81:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) 2024-05-10 08:32:26.056 May 10 06:32:26 10.211.164.207 [ 1413.171511] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) 2024-05-10 08:32:26.056 May 10 06:32:26 10.211.164.207 [ 1413.205775] ------------[ cut here ]------------ 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.211390] UBSAN: array-index-out-of-bounds in kernel/locking/qspinlock.c:131:9 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.220036] index 15157 is out of range for type 'long unsigned int [8192]' 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.228192] CPU: 12 PID: 159198 Comm: kworker/u85:1 Tainted: G OE 6.8.8-100.fc38.x86_64 #1 2024-05-10 
08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.239264] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.251111] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.258159] Call Trace: 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.261277] 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.264004] dump_stack_lvl+0x64/0x80 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.268482] __ubsan_handle_out_of_bounds+0x95/0xd0 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.274313] native_queued_spin_lock_slowpath+0x2cb/0x2d0 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.280717] _raw_spin_lock_irqsave+0x3d/0x50 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.285956] mlx5_ib_poll_cq+0x50/0xe20 [mlx5_ib] 2024-05-10 08:32:26.306 May 10 06:32:26 10.211.164.207 [ 1413.291604] ? finish_task_switch.isra.0+0x94/0x2f0 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.297418] __ib_process_cq+0x4f/0x180 [ib_core] 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.303068] ib_cq_poll_work+0x2a/0x80 [ib_core] 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.308608] process_one_work+0x176/0x340 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.313439] worker_thread+0x27b/0x3a0 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.317970] ? __pfx_worker_thread+0x10/0x10 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.323080] kthread+0xe8/0x120 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.326926] ? __pfx_kthread+0x10/0x10 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.331450] ret_from_fork+0x34/0x50 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.335780] ? 
__pfx_kthread+0x10/0x10 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.340294] ret_from_fork_asm+0x1b/0x30 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.345002] 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.347764] ---[ end trace ]--- 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.351595] BUG: unable to handle page fault for address: 00000000009e64e9 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.359598] #PF: supervisor write access in kernel mode 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.365757] #PF: error_code(0x0002) - not-present page 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.371818] PGD 0 P4D 0 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.374969] Oops: 0002 [#1] PREEMPT SMP PTI 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.379961] CPU: 12 PID: 159198 Comm: kworker/u85:1 Tainted: G OE 6.8.8-100.fc38.x86_64 #1 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.390975] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.402771] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.409743] RIP: 0010:native_queued_spin_lock_slowpath+0x27f/0x2d0 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.416983] Code: 41 89 d6 44 0f b7 e8 41 83 ee 01 49 c1 e5 05 4d 63 f6 49 81 c5 00 56 03 00 49 81 fe 00 20 00 00 73 45 4e 03 2c f5 a0 2c c0 a5 <49> 89 6d 00 8b 45 08 85 c0 75 09 f3 90 8b 45 08 85 c0 74 f7 48 8b 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.438644] RSP: 0018:ffffb96d48e07d48 EFLAGS: 00010002 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.444826] RAX: 0000000000000001 RBX: ffff97eeaf26fa00 RCX: 00000000ffffffff 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.453143] RDX: 
0000000000000000 RSI: 0000000000000001 RDI: ffffffffa5ada5ea 2024-05-10 08:32:26.307 May 10 06:32:26 10.211.164.207 [ 1413.461456] RBP: ffff97f27c0b5600 R08: 0000000000000000 R09: ffffb96d48303020 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.469765] R10: ffff97eb469dff50 R11: ffff97eb469dff50 R12: 0000000000340000 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.478074] R13: 00000000009e64e9 R14: 0000000000003b35 R15: 0000000000000010 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.486382] FS: 0000000000000000(0000) GS:ffff97f27c080000(0000) knlGS:0000000000000000 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.495765] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.502527] CR2: 00000000009e64e9 CR3: 00000001e2428003 CR4: 00000000001706f0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.510840] Call Trace: 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.513904] 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.516579] ? __die+0x23/0x70 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.520321] ? page_fault_oops+0x171/0x4f0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.525224] ? exc_page_fault+0x7f/0x180 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.529925] ? asm_exc_page_fault+0x26/0x30 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.534913] ? native_queued_spin_lock_slowpath+0x27f/0x2d0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.541444] ? native_queued_spin_lock_slowpath+0x2cb/0x2d0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.547969] _raw_spin_lock_irqsave+0x3d/0x50 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.553140] mlx5_ib_poll_cq+0x50/0xe20 [mlx5_ib] 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.558715] ? 
finish_task_switch.isra.0+0x94/0x2f0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.564461] __ib_process_cq+0x4f/0x180 [ib_core] 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.570056] ib_cq_poll_work+0x2a/0x80 [ib_core] 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.575545] process_one_work+0x176/0x340 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.580331] worker_thread+0x27b/0x3a0 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.584821] ? __pfx_worker_thread+0x10/0x10 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.589893] kthread+0xe8/0x120 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.593702] ? __pfx_kthread+0x10/0x10 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.598188] ret_from_fork+0x34/0x50 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.602485] ? __pfx_kthread+0x10/0x10 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.606971] ret_from_fork_asm+0x1b/0x30 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.611655] 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 [ 1413.614391] Modules linked in: nvme_rdma nvme_fabrics xfs vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd nbd mlx5_ib rfkill usdm_drv(OE) rpcrdma rdma_ucm ib_srpt sunrpc ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_umad binfmt_misc rdma_cm ib_ipoib iw_cm ib_cm intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm qat_c62x(OE) intel_qat(OE) irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 iTCO_wdt intel_pmc_bxt rapl ib_uverbs intel_cstate ipmi_si macsec libsas iTCO_vendor_support mei_me i2c_i801 ipmi_devintf intel_uncore ipmi_msghandler pcspkr mgag200 mei scsi_transport_sas uio lpc_ich ioatdma i2c_smbus wmi ib_core joydev ip6_tables ip_tables fuse zram 
bpf_preload loop overlay squashfs netconsole nd_pmem nd_btt nd_e820 libnvdimm virtio_blk virtio_net net_failover failover nvme nvme_core nvme_auth mlx5_c 2024-05-10 08:32:26.557 May 10 06:32:26 10.211.164.207 ore mlxfw psample tls 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.614473] pci_hyperv_intf ixgbe mdio igb i2c_algo_bit dca [last unloaded: nvme_fabrics] 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.726126] CR2: 00000000009e64e9 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.730211] ---[ end trace 0000000000000000 ]--- 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.742448] pstore: backend (erst) writing error (-28) 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.748568] RIP: 0010:native_queued_spin_lock_slowpath+0x27f/0x2d0 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.755852] Code: 41 89 d6 44 0f b7 e8 41 83 ee 01 49 c1 e5 05 4d 63 f6 49 81 c5 00 56 03 00 49 81 fe 00 20 00 00 73 45 4e 03 2c f5 a0 2c c0 a5 <49> 89 6d 00 8b 45 08 85 c0 75 09 f3 90 8b 45 08 85 c0 74 f7 48 8b 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.777615] RSP: 0018:ffffb96d48e07d48 EFLAGS: 00010002 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.783818] RAX: 0000000000000001 RBX: ffff97eeaf26fa00 RCX: 00000000ffffffff 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.792158] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa5ada5ea 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.800484] RBP: ffff97f27c0b5600 R08: 0000000000000000 R09: ffffb96d48303020 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.808820] R10: ffff97eb469dff50 R11: ffff97eb469dff50 R12: 0000000000340000 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.817157] R13: 00000000009e64e9 R14: 0000000000003b35 R15: 0000000000000010 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.825521] FS: 0000000000000000(0000) 
GS:ffff97f27c080000(0000) knlGS:0000000000000000 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.834989] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.841801] CR2: 00000000009e64e9 CR3: 00000001e2428003 CR4: 00000000001706f0 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.850329] Kernel panic - not syncing: Fatal exception 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.856566] Kernel Offset: 0x23000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) 2024-05-10 08:32:26.807 May 10 06:32:26 10.211.164.207 [ 1413.873429] Rebooting in 5 seconds..