Message-ID: <aEf/2+3MU5ED2sxE@dragon>
Date: Tue, 10 Jun 2025 17:50:19 +0800
From: Shawn Guo <shawnguo2@...h.net>
To: Xu Yang <xu.yang_2@....com>
Cc: Peter Chen <peter.chen@...nel.org>, Shawn Guo <shawnguo@...nel.org>,
imx@...ts.linux.dev, linux-usb@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: i.MX kernel hangup caused by chipidea USB gadget driver
On Mon, Jun 09, 2025 at 07:53:22PM +0800, Xu Yang wrote:
<snip>
> During the scp process, the USB host won't put the USB device into the suspend state.
> In the current design, the ether driver then doesn't know the system has
> suspended after echo mem. The root cause is that the ether driver is still trying
> to queue USB requests after the USB controller has suspended and the USB clock is off,
> so the system hangs.
>
> With the above changes, I think the ether driver will fail in eth_start_xmit()
> at an earlier stage, so the issue can't be triggered.
>
> I think the ether driver needs to call gether_suspend() accordingly. To do this,
> the controller driver needs to explicitly call the suspend() function when it's going
> to be suspended. Could you check whether the patch below fixes the issue?
Thanks for the patch, Xu! It does fix the hangup but seems to be less
reliable than my/Peter's change (disconnecting gadget), per my testing
on a custom i.MX8MM board. With your change, host/PC doesn't disconnect
gadget when the board suspends. After a few suspend cycles, Ethernet
gadget stops working and the following workqueue lockup is seen. There
seem to be some other bugs?
[ 223.047990] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 223.054097] rcu: 1-...0: (7 ticks this GP) idle=bb7c/1/0x4000000000000000 softirq=5368/5370 fqs=2431
[ 223.063318] rcu: (detected by 0, t=5252 jiffies, g=4705, q=2400 ncpus=4)
[ 223.070105] Task dump for CPU 1:
[ 223.073330] task:systemd-network state:R running task stack:0 pid:406 ppid:1 flags:0x00000202
[ 223.083248] Call trace:
[ 223.085692] __switch_to+0xc0/0x124
[ 246.747996] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 43s!
However, your change seems to work fine on the i.MX8MM EVK. It's probably
due to the fact that the host disconnects the gadget for some reason when the EVK
suspends. This is a different behavior from the custom board above.
We do not really expect this disconnection, do we?
Shawn
> ---8<--------------------
>
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8a9b31fd5c89..27a7674ed62c 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -2367,6 +2367,8 @@ static void udc_id_switch_for_host(struct ci_hdrc *ci)
> #ifdef CONFIG_PM_SLEEP
> static void udc_suspend(struct ci_hdrc *ci)
> {
> + ci->driver->suspend(&ci->gadget);
> +
> /*
> * Set OP_ENDPTLISTADDR to be non-zero for
> * checking if controller resume from power lost
> @@ -2389,6 +2391,8 @@ static void udc_resume(struct ci_hdrc *ci, bool power_lost)
> /* Restore value 0 if it was set for power lost check */
> if (hw_read(ci, OP_ENDPTLISTADDR, ~0) == 0xFFFFFFFF)
> hw_write(ci, OP_ENDPTLISTADDR, ~0, 0);
> +
> + ci->driver->resume(&ci->gadget);
> }
> #endif
>
> ---->8------------------
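
For illustration only, here is a minimal user-space sketch of the forwarding idea in the quoted patch: the controller's suspend/resume paths pass the event on to the bound gadget driver. It is not kernel code. The mock_* types and the eth_like_* callbacks are hypothetical stand-ins, and the NULL checks rest on the assumption that no gadget driver may be bound, or that a bound driver may leave the callbacks unset, at the time the controller suspends.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct mock_gadget {
	int suspended;	/* set by the function driver's suspend callback */
};

struct mock_gadget_driver {
	void (*suspend)(struct mock_gadget *g);	/* optional in this model */
	void (*resume)(struct mock_gadget *g);	/* optional in this model */
};

struct mock_ci {
	struct mock_gadget gadget;
	struct mock_gadget_driver *driver;	/* NULL when nothing is bound */
};

/* Forward controller suspend to the function driver, if there is one. */
static void mock_udc_suspend(struct mock_ci *ci)
{
	if (ci->driver && ci->driver->suspend)
		ci->driver->suspend(&ci->gadget);
}

/* Forward controller resume to the function driver, if there is one. */
static void mock_udc_resume(struct mock_ci *ci)
{
	if (ci->driver && ci->driver->resume)
		ci->driver->resume(&ci->gadget);
}

/* Example callbacks playing the role of an Ethernet function driver. */
static void eth_like_suspend(struct mock_gadget *g)
{
	g->suspended = 1;	/* a real driver would stop queuing requests here */
}

static void eth_like_resume(struct mock_gadget *g)
{
	g->suspended = 0;	/* a real driver would allow queuing again here */
}
```

In this model, calling mock_udc_suspend() with no driver bound is a no-op, while with the eth_like_* callbacks bound it marks the gadget suspended until mock_udc_resume() is called. Whether the real udc_suspend()/udc_resume() need the same guards depends on when they can run relative to gadget driver binding, which this sketch does not establish.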