Message-ID: <4fc6ec7a-ab2d-4b2c-b1f7-7902010c8682@actia.se>
Date: Tue, 10 Jun 2025 15:05:15 +0000
From: John Ernberg <john.ernberg@...ia.se>
To: Xu Yang <xu.yang_2@....com>
CC: Shawn Guo <shawnguo2@...h.net>, Peter Chen <peter.chen@...nel.org>, "Shawn
 Guo" <shawnguo@...nel.org>, "imx@...ts.linux.dev" <imx@...ts.linux.dev>,
	"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: Re: i.MX kernel hangup caused by chipidea USB gadget driver

Hi Xu,

On 6/10/25 1:30 PM, Xu Yang wrote:
> Hi John,
> 
> On Mon, Jun 09, 2025 at 02:17:30PM +0000, John Ernberg wrote:
>> Hi Shawn, Xu,
>>
>> On Mon, Jun 09, 2025 at 07:53:22PM +0800, Xu Yang wrote:
>>> Hi Shawn,
>>>
>>> Thanks for your reports!
>>>
>>> On Mon, Jun 09, 2025 at 01:31:06PM +0800, Shawn Guo wrote:
>>>> Hi Xu, Peter,
>>>>
>>>> I'm seeing a kernel hangup on imx8mm-evk board.  It happens when:
>>>>
>>>>   - USB gadget is enabled as Ethernet
>>>>   - There is data transfer over USB Ethernet
>>>>   - Device is going in/out suspend
>>>>
>>>> A simple way to reproduce the issue could be:
>>>>
>>>>   1. Copy a big file (like 500MB) from host PC to device with scp
>>>>
>>>>   2. While the file copy is ongoing, suspend & resume the device like:
>>>>
>>>>      $ echo +3 > /sys/class/rtc/rtc0/wakealarm; echo mem > /sys/power/state
>>>>
>>>>   3. The device will hang up there
>>>>
>>>> I reproduced on the following kernels:
>>>>
>>>>   - Mainline kernel
>>>>   - NXP kernel lf-6.6.y
>>>>   - NXP kernel lf-6.12.y
>>>>
>>>> But NXP kernel lf-6.1.y doesn't have this problem.  I tracked it down to
>>>> Peter's commit [1] on lf-6.1.y, and found that the gadget disconnect &
>>>> connect calls got lost from suspend & resume hooks, when the commit was
>>>> split and pushed upstream.  I confirm that adding the calls back fixes
>>>> the hangup.
>>
>> We probably ran into the same problem trying to bring onboard 6.12, going
>> from 6.1, on iMX8QXP. I managed to trace the hang to EP priming through a
>> combination of debug tracing and BUG_ON experiments. See if it starts
>> splatting with the below change.
>>
>> ----------------->8------------------
>>
>> From 092599ab6f9e20412a7ca1eb118dd2be80cd18ff Mon Sep 17 00:00:00 2001
>> From: John Ernberg <john.ernberg@...ia.se>
>> Date: Mon, 5 May 2025 09:09:01 +0200
>> Subject: [PATCH] USB: ci: gadget: Panic if priming when gadget off
>>
>> ---
>>   drivers/usb/chipidea/udc.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
>> index 2fea263a5e30..544aa4fa2d1d 100644
>> --- a/drivers/usb/chipidea/udc.c
>> +++ b/drivers/usb/chipidea/udc.c
>> @@ -203,8 +203,10 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>>
>>      hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));
>>
>> -   while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
>> +   while (hw_read(ci, OP_ENDPTPRIME, BIT(n))) {
>>          cpu_relax();
>> +       BUG_ON(dir == TX && !hw_read(ci, OP_ENDPTCTRL + num, ENDPTCTRL_TXE));
>> +   }
>>      if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
>>          return -EAGAIN;
>>
>> ----------------->8------------------
>>
>> On the iMX8QXP you may additionally run into asynchronous aborts and SErrors
>> due to a resource being disabled.
>>
>>>>
>>>> ---8<--------------------
>>>>
>>>> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
>>>> index 8a9b31fd5c89..72329a7eac4d 100644
>>>> --- a/drivers/usb/chipidea/udc.c
>>>> +++ b/drivers/usb/chipidea/udc.c
>>>> @@ -2374,6 +2374,9 @@ static void udc_suspend(struct ci_hdrc *ci)
>>>>           */
>>>>          if (hw_read(ci, OP_ENDPTLISTADDR, ~0) == 0)
>>>>                  hw_write(ci, OP_ENDPTLISTADDR, ~0, ~0);
>>>> +
>>>> +       if (ci->driver && ci->vbus_active && (ci->gadget.state != USB_STATE_SUSPENDED))
>>>> +               usb_gadget_disconnect(&ci->gadget);
>>>>   }
>>>>
>>>>   static void udc_resume(struct ci_hdrc *ci, bool power_lost)
>>>> @@ -2384,6 +2387,9 @@ static void udc_resume(struct ci_hdrc *ci, bool power_lost)
>>>>                                          OTGSC_BSVIS | OTGSC_BSVIE);
>>>>                  if (ci->vbus_active)
>>>>                          usb_gadget_vbus_disconnect(&ci->gadget);
>>>> +       } else {
>>>> +               if (ci->driver && ci->vbus_active)
>>>> +                       usb_gadget_connect(&ci->gadget);
>>>>          }
>>>>
>>>>          /* Restore value 0 if it was set for power lost check */
>>>>
>>>> ---->8------------------
> 
> Does the above change work for you?

I hope to allocate some time to test this in the next few days.

Best regards // John Ernberg
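
For readers following the thread: the loop in hw_ep_prime() quoted above
polls OP_ENDPTPRIME with no exit condition, so it would spin forever if the
bit never clears, which is what the BUG_ON experiment appears to point at.
Below is a minimal user-space sketch of a bounded variant of that polling
pattern. Everything in it (struct fake_udc, fake_read_prime(),
fake_ep_prime_bounded(), PRIME_MAX_POLLS) is hypothetical and only models
the behaviour. It is not the chipidea driver code and not a proposed fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller, not the driver's struct ci_hdrc. */
struct fake_udc {
	uint32_t endptprime;	/* models the ENDPTPRIME register */
	bool running;		/* false models a stopped or unclocked controller */
};

/* Arbitrary illustrative bound on the number of polls. */
#define PRIME_MAX_POLLS 1000u

/*
 * Models a masked register read: a running controller clears the prime
 * bit once it has consumed the request, a stopped one never does.
 */
static uint32_t fake_read_prime(struct fake_udc *udc, uint32_t mask)
{
	if (udc->running)
		udc->endptprime &= ~mask;
	return udc->endptprime & mask;
}

/*
 * Set the prime bit for endpoint bit position n, then poll until it
 * clears. Unlike an unbounded while loop, give up after PRIME_MAX_POLLS
 * reads and report a timeout to the caller.
 */
static int fake_ep_prime_bounded(struct fake_udc *udc, unsigned int n)
{
	uint32_t bit;
	unsigned int polls;

	assert(n < 32);
	bit = UINT32_C(1) << n;
	udc->endptprime |= bit;

	for (polls = 0; polls < PRIME_MAX_POLLS; polls++) {
		if (!fake_read_prime(udc, bit))
			return 0;
	}
	return -ETIMEDOUT;
}
```

With running set to true the call returns 0 on the first poll. With running
set to false it returns -ETIMEDOUT and leaves the prime bit set, instead of
spinning indefinitely.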
