Message-ID: <16b90f1d-69b4-72ac-7018-66d524f514f9@camlingroup.com>
Date: Mon, 12 Sep 2022 17:16:24 +0200
From: Lech Perczak <lech.perczak@...lingroup.com>
To: Jérôme Pouiller <jerome.pouiller@...abs.com>,
    linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
    Paweł Lenkow <pawel.lenkow@...lingroup.com>
Cc: Kalle Valo <kvalo@...nel.org>,
    Krzysztof Drobiński <krzysztof.drobinski@...lingroup.com>,
    Kirill Yatsenko <kirill.yatsenko@...lingroup.com>
Subject: wfx: Memory corruption during high traffic with WFM200 on i.MX6Q platform

Hello,
We're trying to get a WFM200S022XNN3 module working on a custom i.MX6Q board over the SDIO interface, with an upstream kernel. Our patches primarily concern the device tree for the board, and we use the upstream firmware from the linux-firmware repository.
While doing so, we ran into a memory corruption issue that appears when heavy traffic passes through the device. Our adapter runs in AP mode. The issue reproduces 100% of the time with iperf3: we start an AP interface and an iperf3 server on the device, and the client station then runs "iperf3 -c <hostname> -t 3600", so the AP is sending data for up to one hour. However, the kernel on our device crashes after a few minutes of traffic, sometimes in less than a minute.
The behaviour is the same on kernels v5.19.7, v5.19.2, and even v6.0-rc5. Tests on v6.0-rc5 have produced the most detailed stack trace so far:
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 00000101
[00000101] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
Modules linked in: xt_LOG nf_log_syslog xt_limit iptable_mangle xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet cdc_acm usb_serial_simple usbserial usb_f_rndis u_ether wfx mac80211 libarc4 cfg80211 evbug phy_generic ci_hdrc_imx ci_hdrc adt7475 hwmon_vid ulpi roles usbmisc_imx pwm_imx27 pwm_beeper libcomposite configfs udc_core
CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 6.0.0-rc5+g047dc4cf9a10 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
PC is at kfree_skb_list_reason+0x10/0x24
LR is at ieee80211_report_used_skb+0xd0/0x5b4 [mac80211]
pc : [<80773238>] lr : [<7f136538>] psr: 20000113
sp : f0801e60 ip : 00000000 fp : 838f04e2
r10: 00000001 r9 : 838f04e2 r8 : 00000000
r7 : 82661580 r6 : 00000000 r5 : 82660580 r4 : 00000101
r3 : 838f0700 r2 : 00000032 r1 : 00000001 r0 : 00000101
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 11d0004a DAC: 00000051
Register r0 information: non-paged memory
Register r1 information: non-paged memory
Register r2 information: non-paged memory
Register r3 information: slab kmalloc-1k start 838f0400 pointer offset 768 size 1024
Register r4 information: non-paged memory
Register r5 information: slab kmalloc-8k start 82660000 pointer offset 1408 size 8192
Register r6 information: NULL pointer
Register r7 information: slab kmalloc-8k start 82660000 pointer offset 5504 size 8192
Register r8 information: NULL pointer
Register r9 information: slab kmalloc-1k start 838f0400 pointer offset 226 size 1024
Register r10 information: non-paged memory
Register r11 information: slab kmalloc-1k start 838f0400 pointer offset 226 size 1024
Register r12 information: NULL pointer
Process ksoftirqd/0 (pid: 10, stack limit = 0x1fff5f96)
Stack: (0xf0801e60 to 0xf0802000)
1e60: 8393cd80 7f136538 00000000 81590f34 80f050b4 20000193 f0801ecc 7f189a7c
1e80: 00000032 00000005 823f0458 f0801f18 81c51a00 8368504c 7f189854 83898000
1ea0: 8226ac40 40000210 00000200 80f04ec8 f17ddddc 00000000 f0801f18 82660580
1ec0: 8393cd80 00000000 00000000 8393cd98 838f04e2 7f13791c 00000000 00000000
1ee0: 82660580 00004288 00000000 838f04e2 82660580 8393cd98 82660580 838f04e2
1f00: 82660a8c 7f1906b0 7f190708 00000000 40000006 7f137d18 8368578c 8393cd98
1f20: 8393cd80 00000000 00000000 00000000 00000000 00000000 82660a8c 80f04ec8
1f40: 8393cd80 82660580 82660a7c 7f1347f8 00000000 80f04ec8 00000001 82660a64
1f60: 00000000 eefad338 00000000 00000006 80be7f14 801246f8 00000006 80f03098
1f80: 80f03080 81504c80 00000101 8010140c f0861e78 80915818 8225e100 f0801f90
1fa0: 80f03080 80e543c0 80c059f4 0000000a 80e56a40 80e56a40 80e54334 80c284f4
1fc0: 00005a10 80f03d40 80a01e20 04208040 80c059f4 80e56a40 20000013 ffffffff
1fe0: f0861eb4 81504c80 81504c80 80f050b4 f0861e78 801245ac 80144024 804772fc
 kfree_skb_list_reason from ieee80211_report_used_skb+0xd0/0x5b4 [mac80211]
 ieee80211_report_used_skb [mac80211] from ieee80211_tx_status_ext+0x4c8/0x850 [mac80211]
 ieee80211_tx_status_ext [mac80211] from ieee80211_tx_status+0x74/0x9c [mac80211]
 ieee80211_tx_status [mac80211] from ieee80211_tasklet_handler+0xb0/0xd8 [mac80211]
 ieee80211_tasklet_handler [mac80211] from tasklet_action_common.constprop.0+0xb0/0xc4
 tasklet_action_common.constprop.0 from __do_softirq+0x14c/0x2c0
 __do_softirq from irq_exit+0x98/0xc8
 irq_exit from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x98/0xc8
Exception stack(0xf0861e80 to 0xf0861ec8)
1e80: 00000001 00000002 00000001 81504c80 eefafdc0 00000000 81590880 00000000
1ea0: 81504c80 81505248 80f050b4 f0861f14 f0861f18 f0861ed0 80915bec 80144024
1ec0: 20000013 ffffffff
 __irq_svc from finish_task_switch+0xa8/0x270
 finish_task_switch from __schedule+0x25c/0x628
 __schedule from schedule+0x5c/0xb4
 schedule from smpboot_thread_fn+0xbc/0x23c
 smpboot_thread_fn from kthread+0xf4/0x124
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xf0861fb0 to 0xf0861ff8)
1fa0:                                     00000000 00000000 00000000 00000000
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Code: e92d4010 e2504000 08bd8010 e1a00004 (e5944000)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception in interrupt
CPU2: stopping
CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D 6.0.0-rc5+g047dc4cf9a10 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x40/0x4c
 dump_stack_lvl from do_handle_IPI+0x100/0x128
 do_handle_IPI from ipi_handler+0x18/0x20
 ipi_handler from handle_percpu_devid_irq+0x8c/0x138
 handle_percpu_devid_irq from generic_handle_domain_irq+0x24/0x34
 generic_handle_domain_irq from gic_handle_irq+0x74/0x88
 gic_handle_irq from generic_handle_arch_irq+0x58/0x78
 generic_handle_arch_irq from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x98/0xc8
Exception stack(0xf0871f10 to 0xf0871f58)
1f00:                                     00000002 80bf66e8 00000001 6e16f000
1f20: 00000000 80f0a668 00000000 00000000 a05c2adc a0629de7 eefc50c8 0000007b
1f40: fffffff5 f0871f60 80155d84 807006d8 60030013 ffffffff
 __irq_svc from cpuidle_enter_state+0x158/0x358
 cpuidle_enter_state from cpuidle_enter+0x40/0x50
 cpuidle_enter from do_idle+0x19c/0x208
 do_idle from cpu_startup_entry+0x18/0x1c
 cpu_startup_entry from secondary_start_kernel+0x148/0x150
 secondary_start_kernel from 0x10101620
CPU3: stopping
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G D 6.0.0-rc5+g047dc4cf9a10 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x40/0x4c
 dump_stack_lvl from do_handle_IPI+0x100/0x128
 do_handle_IPI from ipi_handler+0x18/0x20
 ipi_handler from handle_percpu_devid_irq+0x8c/0x138
 handle_percpu_devid_irq from generic_handle_domain_irq+0x24/0x34
 generic_handle_domain_irq from gic_handle_irq+0x74/0x88
 gic_handle_irq from generic_handle_arch_irq+0x58/0x78
 generic_handle_arch_irq from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x98/0xc8
Exception stack(0xf0875f10 to 0xf0875f58)
5f00:                                     00000003 80bf66e8 00000001 6e17a000
5f20: 00000000 80f0a668 00000000 00000000 a05c5ef1 a0629de7 eefd00c8 0000007b
5f40: fffffff5 f0875f60 80155d84 807006d8 60000013 ffffffff
 __irq_svc from cpuidle_enter_state+0x158/0x358
 cpuidle_enter_state from cpuidle_enter+0x40/0x50
 cpuidle_enter from do_idle+0x19c/0x208
 do_idle from cpu_startup_entry+0x18/0x1c
 cpu_startup_entry from secondary_start_kernel+0x148/0x150
 secondary_start_kernel from 0x10101620
CPU1: stopping
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 6.0.0-rc5+g047dc4cf9a10 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x40/0x4c
 dump_stack_lvl from do_handle_IPI+0x100/0x128
 do_handle_IPI from ipi_handler+0x18/0x20
 ipi_handler from handle_percpu_devid_irq+0x8c/0x138
 handle_percpu_devid_irq from generic_handle_domain_irq+0x24/0x34
 generic_handle_domain_irq from gic_handle_irq+0x74/0x88
 gic_handle_irq from generic_handle_arch_irq+0x58/0x78
 generic_handle_arch_irq from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x98/0xc8
Exception stack(0xf086df10 to 0xf086df58)
df00:                                     00000001 80bf66e8 00000001 6e164000
df20: 00000000 80f0a668 00000000 00000000 a05c2d77 a0629de7 eefba0c8 0000007b
df40: fffffff5 f086df60 80155d84 807006d8 600e0013 ffffffff
 __irq_svc from cpuidle_enter_state+0x158/0x358
 cpuidle_enter_state from cpuidle_enter+0x40/0x50
 cpuidle_enter from do_idle+0x19c/0x208
 do_idle from cpu_startup_entry+0x18/0x1c
 cpu_startup_entry from secondary_start_kernel+0x148/0x150
 secondary_start_kernel from 0x10101620
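
The "Register rN information" lines of the first oops can be cross-checked mechanically. The following is only a minimal Python sketch, not part of any kernel tooling: the helper name group_by_slab_object and the regular expression are our own illustrative assumptions. It groups the registers by the slab object they point into and recomputes each pointer as start + offset, using the slab lines copied from the v6.0-rc5 trace:

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Register r3 information: slab kmalloc-1k start 838f0400 pointer offset 768 size 1024
SLAB_RE = re.compile(
    r"Register r(\d+) information: slab (\S+) start ([0-9a-f]+) "
    r"pointer offset (\d+) size (\d+)"
)


def group_by_slab_object(log_text):
    """Map (cache name, object start) -> list of (register, pointer value)."""
    groups = defaultdict(list)
    for m in SLAB_RE.finditer(log_text):
        reg, cache, start, offset, size = m.groups()
        start, offset, size = int(start, 16), int(offset), int(size)
        if offset >= size:
            continue  # pointer past the object; ignored in this sketch
        groups[(cache, start)].append((f"r{reg}", start + offset))
    return dict(groups)


# Slab-related register lines copied from the v6.0-rc5 oops (r11 is fp).
OOPS_EXCERPT = """\
Register r3 information: slab kmalloc-1k start 838f0400 pointer offset 768 size 1024
Register r5 information: slab kmalloc-8k start 82660000 pointer offset 1408 size 8192
Register r7 information: slab kmalloc-8k start 82660000 pointer offset 5504 size 8192
Register r9 information: slab kmalloc-1k start 838f0400 pointer offset 226 size 1024
Register r11 information: slab kmalloc-1k start 838f0400 pointer offset 226 size 1024
"""

if __name__ == "__main__":
    for (cache, start), regs in group_by_slab_object(OOPS_EXCERPT).items():
        pointers = ", ".join(f"{name}={value:08x}" for name, value in regs)
        print(f"{cache} object at {start:08x}: {pointers}")
```

The recomputed pointers agree with the register dump: r3, r9 and fp all point into one kmalloc-1k object, and r5 and r7 into one kmalloc-8k object. The faulting value in r0 and r4 (00000101) is reported as non-paged memory, so in this trace it is not a slab pointer at all.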
However, the corruption can manifest itself in other ways as well, sometimes even damaging the contents of the onboard NAND flash. Similar traces have previously appeared in other places too. In addition to testing on v6.0-rc5, we tried cherry-picking 047dc4cf9a10b4f2dc164b8bf192de583f3ebfee from wireless-next, but at first glance it seems unrelated to the issue, and it doesn't prevent the crashes. Below are the relevant bits of the device tree we used to get the module to work. We're using the in-band IRQ of the SDIO interface:
/ {
    wfx_pwrseq: wfx_pwrseq {
        compatible = "mmc-pwrseq-simple";
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_wfx_reset>;
        reset-gpios = <&gpio7 8 GPIO_ACTIVE_LOW>;
    };
};

&iomuxc {
    usdhc1 {
        pinctrl_usdhc1_3: usdhc1grp-3 {
            fsl,pins = <
                MX6QDL_PAD_SD1_CMD__SD1_CMD      0x17059
                MX6QDL_PAD_SD1_CLK__SD1_CLK      0x10059
                MX6QDL_PAD_SD1_DAT0__SD1_DATA0   0x17059
                MX6QDL_PAD_SD1_DAT1__SD1_DATA1   0x17059
                MX6QDL_PAD_SD1_DAT2__SD1_DATA2   0x17059
                MX6QDL_PAD_SD1_DAT3__SD1_DATA3   0x17059
                MX6QDL_PAD_SD3_CLK__GPIO7_IO03   0x17041
                MX6QDL_PAD_SD3_CMD__GPIO7_IO02   0x13019
            >;
        };

        pinctrl_wfx_reset: wfx-reset-grp {
            fsl,pins = <
                MX6QDL_PAD_SD3_RST__GPIO7_IO08   0x1B030
            >;
        };
    };
};

&usdhc1 {
    status = "okay";
    #address-cells = <1>;
    #size-cells = <0>;
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_usdhc1_3>;
    cap-power-off-card;
    keep-power-in-suspend;
    cap-sdio-irq;
    wakeup-source;
    disable-wp;
    cap-sd-highspeed;
    bus-width = <4>;
    non-removable;
    no-mmc;
    no-sd;
    mmc-pwrseq = <&wfx_pwrseq>;

    wifi@1 {
        compatible = "silabs,brd8023a";
        reg = <1>;
        wakeup-gpios = <&gpio7 2 GPIO_ACTIVE_HIGH>;
    };
};
With that, the device probes successfully, and we can get 22 Mbps of traffic in both directions with a 1T1R peer in HT20 mode. The SDIO signals were checked with an oscilloscope and they look perfectly fine, so I think we can rule out any hardware issue.
By adding a canary to the slab allocator, we found that the skb structures get damaged and are then improperly dereferenced by the driver somewhere in the TX queue handling code. With SMP disabled, the issue still manifests itself, hinting at a synchronization issue between the interrupt context and the tasklets handling the bulk of the work. In some cases the kernel detects a use-after-free by itself, without any modification, or the reference counts get corrupted. This stack trace comes from one of the runs with CONFIG_SMP disabled:
8<------------[ cut here ]------------
WARNING: CPU: 0 PID: 10 at lib/refcount.c:28 ieee80211_tx_status_ext+0x4f8/0x968 [mac80211]
refcount_t: underflow; use-after-free.
Modules linked in: xt_LOG nf_log_syslog xt_limit iptable_mangle xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet cdc_acm usb_serial_simple usbserial usb_f_rndis u_ether wfx mac80211 libarc4 evbug phy_generic cfg80211 adt7475 hwmon_vid ci_hdrc_imx ci_hdrc ulpi roles usbmisc_imx pwm_imx27 pwm_beeper libcomposite configfs udc_core
CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G W 5.19.2+ge4fb6643395f #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0xb0/0xd8
 __warn from warn_slowpath_fmt+0x98/0xc8
 warn_slowpath_fmt from ieee80211_tx_status_ext+0x4f8/0x968 [mac80211]
 ieee80211_tx_status_ext [mac80211] from ieee80211_tx_status+0x74/0x9c [mac80211]
 ieee80211_tx_status [mac80211] from ieee80211_tasklet_handler+0xb0/0xd8 [mac80211]
 ieee80211_tasklet_handler [mac80211] from tasklet_action_common.constprop.0+0xb4/0xc0
 tasklet_action_common.constprop.0 from __do_softirq+0x12c/0x290
 __do_softirq from irq_exit+0x90/0xbc
 irq_exit from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x94/0xc4
Exception stack(0xf0859e98 to 0xf0859ee0)
9e80:                                                       00000001 81080780
9ea0: 00000001 81080780 00000000 00000002 822f0780 808e82cc 81080780 81080c50
9ec0: 00000000 f0859f14 f0859f18 f0859ee8 801404f0 80140624 20000013 ffffffff
 __irq_svc from finish_task_switch+0x78/0x1f8
 finish_task_switch from __schedule+0x244/0x580
 __schedule from schedule+0x5c/0xb4
 schedule from smpboot_thread_fn+0xb8/0x224
 smpboot_thread_fn from kthread+0xe4/0x114
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xf0859fb0 to 0xf0859ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1131 at lib/refcount.c:22 __tcp_transmit_skb+0x7a4/0xa8c
refcount_t: saturated; leaking memory.
Modules linked in: xt_LOG nf_log_syslog xt_limit iptable_mangle xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet cdc_acm usb_serial_simple usbserial usb_f_rndis u_ether wfx mac80211 libarc4 evbug phy_generic cfg80211 adt7475 hwmon_vid ci_hdrc_imx ci_hdrc ulpi roles usbmisc_imx pwm_imx27 pwm_beeper libcomposite configfs udc_core
CPU: 0 PID: 1131 Comm: kworker/0:2H Tainted: G W 5.19.2+ge4fb6643395f #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: wfx_bh_wq bh_work [wfx]
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0xb0/0xd8
 __warn from warn_slowpath_fmt+0x98/0xc8
 warn_slowpath_fmt from __tcp_transmit_skb+0x7a4/0xa8c
 __tcp_transmit_skb from __tcp_send_ack.part.0+0xd0/0x13c
 __tcp_send_ack.part.0 from tcp_delack_timer_handler+0xb0/0x180
 tcp_delack_timer_handler from tcp_delack_timer+0x2c/0x128
 tcp_delack_timer from call_timer_fn.constprop.0+0x18/0x80
 call_timer_fn.constprop.0 from run_timer_softirq+0x2ec/0x3b0
 run_timer_softirq from __do_softirq+0x12c/0x290
 __do_softirq from call_with_stack+0x18/0x20
 call_with_stack from do_softirq+0x6c/0x70
 do_softirq from __local_bh_enable_ip+0xd8/0xdc
 __local_bh_enable_ip from __netdev_alloc_skb+0x14c/0x170
 __netdev_alloc_skb from bh_work+0x1b0/0x650 [wfx]
 bh_work [wfx] from process_one_work+0x1b8/0x3ec
 process_one_work from worker_thread+0x4c/0x57c
 worker_thread from kthread+0xe4/0x114
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xf161dfb0 to 0xf161dff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
---[ end trace 0000000000000000 ]---
[ 5] 536.16-537.00 sec 26.9 KBytes 261 Kbits/sec
[ 5] 537.00-538.00 sec 2.71 MBytes 22.7 Mbits/sec
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000011c
[0000011c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT ARM
Modules linked in: xt_LOG nf_log_syslog xt_limit iptable_mangle xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet cdc_acm usb_serial_simple usbserial usb_f_rndis u_ether wfx mac80211 libarc4 evbug phy_generic cfg80211 adt7475 hwmon_vid ci_hdrc_imx ci_hdrc ulpi roles usbmisc_imx pwm_imx27 pwm_beeper libcomposite configfs udc_core
CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G W 5.19.2+ge4fb6643395f #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
PC is at ip6_rcv_core+0x110/0x68c
LR is at ip6_rcv_core+0xb0/0x68c
pc : [<8084d278>] lr : [<8084d218>] psr: 20000013
sp : f0859e18 ip : 00000000 fp : 80e13cc0
r10: 00000000 r9 : 80e13cf4 r8 : 81b65000
r7 : 80e6d7c8 r6 : 82024c00 r5 : 812a8760 r4 : 81be5b40
r3 : 00000000 r2 : 00000100 r1 : 000000d7 r0 : 00000000
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c53c7d Table: 12338059 DAC: 00000051
Register r0 information: NULL pointer
Register r1 information: non-paged memory
Register r2 information: non-paged memory
Register r3 information: NULL pointer
Register r4 information: slab skbuff_head_cache start 81be5b40 pointer offset 0 size 48
Register r5 information: non-slab/vmalloc memory
Register r6 information: slab kmalloc-1k start 82024c00 pointer offset 0 size 1024
Register r7 information: non-slab/vmalloc memory
Register r8 information: slab kmalloc-2k start 81b65000 pointer offset 0 size 2048
Register r9 information: non-slab/vmalloc memory
Register r10 information: NULL pointer
Register r11 information: non-slab/vmalloc memory
Register r12 information: NULL pointer
Process ksoftirqd/0 (pid: 10, stack limit = 0x7cac7060)
Stack: (0xf0859e18 to 0xf085a000)
9e00:                                                       81b65000 80e13d00
9e20: 80e6d7c8 80e13cc8 00000040 80e13cf4 00000000 8084da90 80d0ce80 80d0424c
9e40: 80d0ce80 81b65000 80e13d00 00000001 80e13cc8 80d0424c 8084da60 80e13d00
9e60: 00000001 807691c0 00000001 81be5b40 80d06654 80d0424c 81be5b40 80769348
9e80: 00000001 80e13d00 00000040 f0859ecb 80dd6000 00008b6a f0859ed4 80769ec4
9ea0: 00000001 81080780 00000000 80e13d00 0000012c 00000000 f0859ecc 8076a2d8
9ec0: 00008b6c 81080780 00859f18 f0859ecc f0859ecc f0859ed4 f0859ed4 80d0424c
9ee0: 00000051 00000000 00000003 80e15834 80e15828 81080780 00000100 80adb4e4
9f00: 40000003 801013f4 821d9540 00000000 f0859f5c 80e15828 80d0d390 80e13c80
9f20: 80af6e3c 0000000a 80d0b588 80b19518 00008b6b 80dd6000 04208040 80901dd0
9f40: 81080780 00000000 8102de00 81080780 80d0b558 00000001 00000001 00000000
9f60: 00000000 80120a18 00000000 8013e590 8102de40 8102df00 8013e42c 8102de00
9f80: 81080780 f0835e30 00000000 8013a85c 8102de40 8013a778 00000000 00000000
9fa0: 00000000 00000000 00000000 80100148 00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
 ip6_rcv_core from ipv6_rcv+0x30/0xd4
 ipv6_rcv from __netif_receive_skb_one_core+0x5c/0x80
 __netif_receive_skb_one_core from process_backlog+0x70/0xe4
 process_backlog from __napi_poll+0x2c/0x1f0
 __napi_poll from net_rx_action+0x140/0x264
 net_rx_action from __do_softirq+0x12c/0x290
 __do_softirq from run_ksoftirqd+0x34/0x3c
 run_ksoftirqd from smpboot_thread_fn+0x164/0x224
 smpboot_thread_fn from kthread+0xe4/0x114
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xf0859fb0 to 0xf0859ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Code: e5843024 e5843028 e584302c 0a000055 (e1d231bc)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception in interrupt

Now, the questions:
- Is "silabs,brd8023a" the proper compatible string for the WFM200S022XNN3, or should we create our own for the bare module, even if only the in-band SDIO IRQ and an external antenna are in use?
- In order to try out the out-of-band IRQ, in the hope that it resolves the issue somehow, do we need to create a custom PDS file? With the IRQ enabled, probe fails with a "Chip did not answer" error.
- Tracing memory corruption is hard. Is there a mechanism that could help us better than generic methods like kprobes or implementing canaries? As skbs are heavily reused for performance reasons, tracing their lifecycle is especially hard. Our first idea was to write-protect their respective pages once they are enqueued in the wfx TX queue, so that the MMU detects the corruption at the exact time it happens, but we haven't figured out how to modify the skb allocator to accomplish that, especially given that the issue mostly happens when transmitting, so the skbs are allocated outside of the driver. Maybe a similar mechanism exists that could help us out, even if it is still in the works?

Any help will be greatly appreciated. We'll be very happy to provide a patch if we manage to figure the issue out.
--
Pozdrawiam/With kind regards,
Lech Perczak
Sr. Software Engineer
Camlin Technologies Poland Limited Sp. z o.o.
Strzegomska 54,
53-611 Wroclaw
Tel: (+48) 71 75 000 16
Email: lech.perczak@...lingroup.com
Website: http://www.camlingroup.com