Message-ID: <e27bd484-f79b-4bfb-95bd-6f24518d1cbe@intel.com>
Date: Wed, 31 Jan 2024 19:33:35 +0100
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Alan Brady <alan.brady@...el.com>
CC: <intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>,
<willemdebruijn.kernel@...il.com>, <igor.bagnucki@...el.com>,
<przemyslaw.kitszel@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH v3 0/7 iwl-next] idpf: refactor virtchnl messages
From: Alan Brady <alan.brady@...el.com>
Date: Mon, 29 Jan 2024 16:59:16 -0800
> The motivation for this series has two primary goals. We want to enable
> support of multiple simultaneous messages and make the channel more
> robust. The way it works right now, the driver can only send and receive
> a single message at a time and if something goes really wrong, it can
> lead to data corruption and strange bugs.
Have you tested v3?
I have this on my system (net-next + your series), no other patches applied:
> [alobakin@...153-KR1-CYP-38282-U39-ETH1 ~]$ sudo modprobe idpf
> [ 89.785966] idpf 0000:83:00.0: Device HW Reset initiated
> [alobakin@...153-KR1-CYP-38282-U39-ETH1 ~]$ [ 90.241658] BUG: unable to handle page fault for address: ff8e1df482000000
> [ 90.241704] #PF: supervisor write access in kernel mode
> [ 90.241728] #PF: error_code(0x0002) - not-present page
> [ 90.241751] PGD 107ffc8067 P4D 107ffc7067 PUD 207fdc8067 PMD 0
> [ 90.241782] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [ 90.241805] CPU: 32 PID: 847 Comm: kworker/32:1 Kdump: loaded Not tainted 6.8.0-rc1-libeth+ #1
> [ 90.241841] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0008.2305172341 05/17/2023
> [ 90.241879] Workqueue: idpf-0000:83:00.0-vc_ev idpf_vc_event_task [idpf]
> [ 90.241932] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [ 90.241962] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [ 90.242027] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [ 90.242053] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 0000000000000034
> [ 90.242084] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: ff8e1df481d980c0
> [ 90.242115] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 0000000000000000
> [ 90.242145] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 0000000000000000
> [ 90.242174] R13: 0000000000267f40 R14: 0000000004000000 R15: 0000000000000000
> [ 90.242206] FS: 0000000000000000(0000) GS:ff2506253ec00000(0000) knlGS:0000000000000000
> [ 90.242240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 90.242266] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 0000000000771ef0
> [ 90.242297] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 90.242327] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 90.242357] PKRU: 55555554
> [ 90.242378] Call Trace:
> [ 90.242393] <TASK>
> [ 90.242410] ? __die_body+0x68/0xb0
> [ 90.242437] ? page_fault_oops+0x3a6/0x400
> [ 90.242467] ? exc_page_fault+0xb2/0x1b0
> [ 90.242496] ? asm_exc_page_fault+0x26/0x30
> [ 90.242527] ? __free_pages_ok+0x338/0x4f0
> [ 90.242554] idpf_mb_clean+0xc1/0x110 [idpf]
> [ 90.242600] idpf_send_mb_msg+0x50/0x1b0 [idpf]
> [ 90.242643] idpf_vc_xn_exec+0x189/0x350 [idpf]
> [ 90.242688] idpf_vc_core_init+0x32c/0x6d0 [idpf]
> [ 90.242735] idpf_vc_event_task+0x2da/0x390 [idpf]
> [ 90.242779] process_scheduled_works+0x251/0x460
> [ 90.242807] worker_thread+0x21c/0x2d0
> [ 90.242830] ? __pfx_worker_thread+0x10/0x10
> [ 90.242855] kthread+0xe8/0x110
> [ 90.242878] ? __pfx_kthread+0x10/0x10
> [ 90.242902] ret_from_fork+0x37/0x50
> [ 90.242925] ? __pfx_kthread+0x10/0x10
> [ 90.243802] ret_from_fork_asm+0x1b/0x30
> [ 90.244598] </TASK>
> [ 90.245361] Modules linked in: idpf libeth rpcrdma rdma_cm ib_cm iw_cm ib_core qrtr rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel binfmt_misc vfat fat kvm irqbypass rapl ipmi_ssif iTCO_wdt intel_cstate intel_pmc_bxt iTCO_vendor_support dax_hmem cxl_acpi nfsd ioatdma intel_uncore mei_me isst_if_mmio cxl_core i2c_i801 isst_if_mbox_pci mei acpi_ipmi pcspkr intel_vsec isst_if_common i2c_smbus joydev dca ipmi_si intel_pch_thermal ipmi_devintf auth_rpcgss ipmi_msghandler acpi_power_meter acpi_pad nfs_acl lockd sunrpc loop grace zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel nvme bnxt_en sha512_ssse3 ast nvme_core sha256_ssse3 sha1_ssse3 i2c_algo_bit wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables dm_multipath fuse
> [ 90.252381] CR2: ff8e1df482000000
> [ 90.253263] ---[ end trace 0000000000000000 ]---
> [ 90.314201] pstore: backend (erst) writing error (-28)
> [ 90.314686] RIP: 0010:__free_pages_ok+0x338/0x4f0
> [ 90.314970] Code: e6 06 45 31 e4 41 bd 40 00 00 00 45 85 ff 74 13 4b 8d 34 28 4c 89 c7 e8 36 97 00 00 8b 74 24 04 41 01 c4 66 90 4c 8b 44 24 08 <43> 81 24 28 00 00 80 c0 49 83 c5 40 4d 39 ee 75 d0 e9 7c fd ff ff
> [ 90.315511] RSP: 0018:ff3f281b098d7b78 EFLAGS: 00010246
> [ 90.315778] RAX: 0000000000100000 RBX: ff8e1df400000000 RCX: 0000000000000034
> [ 90.316043] RDX: 0000000000000d80 RSI: 0000000000000034 RDI: ff8e1df481d980c0
> [ 90.316308] RBP: 0000000000000000 R08: ff8e1df481d980c0 R09: 0000000000000000
> [ 90.316573] R10: ff25062537f9fe00 R11: 0000000000000020 R12: 0000000000000000
> [ 90.316838] R13: 0000000000267f40 R14: 0000000004000000 R15: 0000000000000000
> [ 90.317104] FS: 0000000000000000(0000) GS:ff2506253ec00000(0000) knlGS:0000000000000000
> [ 90.317368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 90.317631] CR2: ff8e1df482000000 CR3: 000000207d420006 CR4: 0000000000771ef0
> [ 90.317897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 90.318163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 90.318427] PKRU: 55555554
> [ 90.318687] note: kworker/32:1[847] exited with irqs disabled
> [ 90.319202] BUG: kernel NULL pointer dereference, address: 0000000000000008
[...]
Thanks,
Olek