[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zd2KBc6uDj2G5gZi@nanopsycho>
Date: Tue, 27 Feb 2024 08:06:45 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jakub Raczynski <j.raczynski@...sung.com>
Cc: netdev@...r.kernel.org, kuba@...nel.org, alexandre.torgue@...s.st.com,
joabreu@...opsys.com
Subject: Re: [PATCH net v2] stmmac: Clear variable when destroying workqueue
Mon, Feb 26, 2024 at 05:42:32PM CET, j.raczynski@...sung.com wrote:
>Currently when suspending driver and stopping workqueue it is checked whether
>workqueue is not NULL and if so, it is destroyed.
>Function destroy_workqueue() does drain queue and does clear variable, but
>it does not set workqueue variable to NULL. This can cause kernel/module
>panic if code attempts to clear workqueue that was not initialized.
>
>This scenario is possible when resuming suspended driver in stmmac_resume(),
>because there is no handling for failed stmmac_hw_setup(),
>which can fail and return if DMA engine has failed to initialize,
>and workqueue is initialized after DMA engine.
>Should DMA engine fail to initialize, resume will proceed normally,
>but interface won't work and TX queue will eventually timeout,
>causing 'Reset adapter' error.
>This then does destroy workqueue during reset process.
>And since workqueue is initialized after DMA engine and can be skipped,
>it will cause kernel/module panic.
>
>To secure against this possible crash, set workqueue variable to NULL when
>destroying workqueue.
>
>Log/backtrace from crash goes as follows:
>[88.031977]------------[ cut here ]------------
>[88.031985]NETDEV WATCHDOG: eth0 (sxgmac): transmit queue 1 timed out
>[88.032017]WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x390/0x398
> <Skipping backtrace for watchdog timeout>
>[88.032251]---[ end trace e70de432e4d5c2c0 ]---
>[88.032282]sxgmac 16d88000.ethernet eth0: Reset adapter.
>[88.036359]------------[ cut here ]------------
>[88.036519]Call trace:
>[88.036523] flush_workqueue+0x3e4/0x430
>[88.036528] drain_workqueue+0xc4/0x160
>[88.036533] destroy_workqueue+0x40/0x270
>[88.036537] stmmac_fpe_stop_wq+0x4c/0x70
>[88.036541] stmmac_release+0x278/0x280
>[88.036546] __dev_close_many+0xcc/0x158
>[88.036551] dev_close_many+0xbc/0x190
>[88.036555] dev_close.part.0+0x70/0xc0
>[88.036560] dev_close+0x24/0x30
>[88.036564] stmmac_service_task+0x110/0x140
>[88.036569] process_one_work+0x1d8/0x4a0
>[88.036573] worker_thread+0x54/0x408
>[88.036578] kthread+0x164/0x170
>[88.036583] ret_from_fork+0x10/0x20
>[88.036588]---[ end trace e70de432e4d5c2c1 ]---
>[88.036597]Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
>
>Fixes: 5a5586112b929 ("net: stmmac: support FPE link partner hand-shaking procedure")
>Signed-off-by: Jakub Raczynski <j.raczynski@...sung.com>
Reviewed-by: Jiri Pirko <jiri@...dia.com>
Next time, send v2 as a separate email starting new thread. Thanks!
Powered by blists - more mailing lists