lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cabf2de7-bde2-4cec-9f99-2cb8c92875a5@quicinc.com>
Date:   Fri, 13 Oct 2023 09:28:18 +0530
From:   Krishna Kurapati PSSNV <quic_kriskura@...cinc.com>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>
CC:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "quic_ppratap@...cinc.com" <quic_ppratap@...cinc.com>,
        "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>,
        "quic_ugoswami@...cinc.com" <quic_ugoswami@...cinc.com>
Subject: Re: [RFC] usb: dwc3: core: Fix RAM interface getting stuck during
 enumeration



On 10/13/2023 3:47 AM, Thinh Nguyen wrote:
> On Fri, Oct 13, 2023, Krishna Kurapati PSSNV wrote:
>>
>>
>> On 10/12/2023 11:29 PM, Thinh Nguyen wrote:
>>
>>>> -static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>>> +int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>>>    {
>>>>    	unsigned long flags;
>>>>    	int ret;
>>>> @@ -2701,7 +2701,7 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>>>    	return ret;
>>>>    }
>>>> -static int dwc3_gadget_soft_connect(struct dwc3 *dwc)
>>>> +int dwc3_gadget_soft_connect(struct dwc3 *dwc)
>>>>    {
>>>>    	int ret;
>>>> @@ -3963,6 +3963,7 @@ static void dwc3_gadget_disconnect_interrupt(struct dwc3 *dwc)
>>>>    	dwc3_gadget_dctl_write_safe(dwc, reg);
>>>>    	dwc->connected = false;
>>>> +	dwc->cable_disconnected = true;
>>>>    	dwc3_disconnect_gadget(dwc);
>>>> @@ -4038,6 +4039,7 @@ static void dwc3_gadget_reset_interrupt(struct dwc3 *dwc)
>>>>    	 */
>>>>    	dwc3_stop_active_transfers(dwc);
>>>>    	dwc->connected = true;
>>>> +	dwc->cable_disconnected = false;
>>>>    	reg = dwc3_readl(dwc->regs, DWC3_DCTL);
>>>>    	reg &= ~DWC3_DCTL_TSTCTRL_MASK;
>>>> -- 
>>>> 2.42.0
>>>>
>>>
>>> We can just reset the controller when there's End Transfer command
>>> timeout as a failure recovery. No need to do what you're doing here.
>>>
>> Hi Thinh,
>>
>>   That was what I initially wanted to do, but there were couple of reasons I
>> wanted to take this approach:
>>
>> 1. We can't just reset the controller in midst of gadget_interrupt. We need
>> to process it completely and then take action.
> 
> You can flag the driver so you can do the teardown/soft-reset at the
> appropriate time.
> 
>>
>> 2. The above log was seen on QRD variant of SM8550/SM8650 easily. But on
>> other platforms of same targets, the issue comes up at some other instances
>> of code, at a point where no IRQ is running. In such cases its not possible
>> to accurately find out code portions and reset the controller. The way I
>> confirmed that both platforms are having the same issue is:
>>
>> a. During cable disconnect, I am not receiving disconnect interrupt
>> b. The reg dump is exactly same in both cases (BMU as well)
>>
>> So I felt it was better to fix it during cable disconnect because even if we
>> remove cable, we are still in device mode only and in this case we can
>> unblock suspend and also bring back controller to a known state.
>>
>> Let me know your thoughts on the above.
>>
> 
> This issue happens outside of disconnect right? Did you account for port
> reset?
> 
> The symptom should be the same. At some point, a command will be issued.
> If a command timed out, then something is really wrong (especially End
> Transfer command). We can attempt to recover base on this symptom then.
> 
> And you don't need to poll for timeout for this specific type of error.
> Just read some known register like GSNPSID to see if it's invalid.

Hi Thinh,

  Yes. It actually happens before disconnect, but I was trying to handle 
it during disconnect. How about I add a check in gadget_ep_cmd saying 
that if cmd timeout happens, then read the snpsid and if it reads an 
invalid value, then I will call soft_disconnect followed by 
soft_connect. That must cover all cases ?

Regards,
Krishna,

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ