[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6d3470af-5d75-a989-dc21-00f42c022af0@quicinc.com>
Date: Thu, 24 Mar 2022 11:53:20 -0700
From: Wesley Cheng <quic_wcheng@...cinc.com>
To: Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
Wesley Cheng <wcheng@...eaurora.org>,
"balbi@...nel.org" <balbi@...nel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"quic_jackp@...cinc.com" <quic_jackp@...cinc.com>
Subject: Re: [RFC PATCH v2 1/3] usb: dwc3: Flush pending SETUP data during
stop active xfers
Hi Thinh,
On 3/23/2022 5:39 PM, Thinh Nguyen wrote:
> Hi Wesley,
>
> Wesley Cheng wrote:
>> Hi Thinh,
>>
>> On 3/22/2022 6:31 PM, Thinh Nguyen wrote:
>>> Hi Wesley,
>>>
>>> Wesley Cheng wrote:
>>>> Hi Thinh,
>>>>
>>>> On 2/15/2022 4:08 PM, Wesley Cheng wrote:
>>>>> While running the pullup disable sequence, if there are pending SETUP
>>>>> transfers stored in the controller, then the ENDTRANSFER commands on non
>>>>> control eps will fail w/ ETIMEDOUT. As a suggestion from SNPS, in order
>>>>> to drain potentially cached SETUP packets, SW needs to issue a
>>>>> STARTTRANSFER command. After issuing the STARTTRANSFER, and retrying the
>>>>> ENDTRANSFER, the command should succeed. Else, if the endpoints are not
>>>>> properly stopped, the controller halt sequence will fail as well.
>>>>>
>>>>> One limitation is that the current logic will drop the SETUP data
>>>>> being received (ie dropping the SETUP packet), however, it should be
>>>>> acceptable in the pullup disable case, as the device is eventually
>>>>> going to disconnect from the host.
>>>>>
>>>>> Signed-off-by: Wesley Cheng <quic_wcheng@...cinc.com>
>>>>> ---
>>>> Just wondering if you had any inputs for this particular change? I
>>>> think on the dequeue path discussion you had some concerns with parts of
>>>> this change? I think the difficult part for the pullup disable path is
>>>> that we have this API running w/ interrupts disabled, so the EP0 state
>>>> won't be able to advance compared to the dequeue case.
>>>
>>> This doesn't sound right. The pullup disable (or device initiated
>>> disconnect) should wait for the EP0 state to be EP0_SETUP_PHASE before
>>> proceeding, which it does.
>>>
>> That is correct, but even though that check is passed, it doesn't
>> prevent the host from sending another SETUP token between the pending
>> setup packet check and run/stop clear.
>>
>
> That should be fine, because we would have the TRB ready for that SETUP
> packet.
>
I agree, its valid for the controller to be able to receive the next
setup packet.
>>>>
>>>> Ideally, if there is a way to ensure that we avoid reading further setup
>>>> packets, that would be nice, but from our discussions with Synopsys,
>>>> this was not possible. (controller is always armed and ready to ACK
>>>> setup tokens)
>>>>
>>>> I did evaluate keeping IRQs enabled and periodically releasing/attaining
>>>> the lock between the stop active xfer calls, but that opened another can
>>>> of worms. If you think this is the approach we should take, I can take
>>>> a look at this implementation further.
>>>>
>>>
>>> This patch doesn't look right to me. The change I suggested before
>>> should address this (I believe Greg already picked it up). What other
>>> problem do you see, I'm not clear what's the problem here. One potential
>>> problem that I can see is that currently dwc3 driver doesn't wait for
>>> active endpoints to complete/end before clearing the run_stop bit on
>>> device initiated disconnect, but I'm not sure if that's what you're seeing.
>>>
>>> Please help clarify further. If there's trace points log, that'd help.
>>>
>> Main difference between the issue Greg recently pulled in and this one
>> is that this is on the pullup disable patch (no dequeue involved). In
>> this situation we will also stop active transfers, so that the
>> controller can halt properly.
>>
>> I attached a few files, and will give a summary of them below:
>> 1. pullup_disable_timeout.txt - ftrace of an instance of when we see
>> several endxfer timeouts. Refer to line#2016.
>>
>> 2. lecroy+ftrace_snip.png - picture showing a matching bus sniffer log
>> and a ftrace collected (difference instance to #1)
>>
>> #2 will show that we completed a SETUP transfer before entering into
>> dwc3_gadget_stop_active_transfers(). In here, we then see DWC ACK
>> another SETUP token, which leads to endxfer timeouts on all subsequent
>> endpoints.
>
> Thanks for the captures. I think the problem here is because the dwc3
> driver disables the control endpoint. It shouldn't do that.
>
> Can you try this:
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index ab725d2262d6..f0b9ea353620 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1010,12 +1010,12 @@ static int __dwc3_gadget_ep_disable(struct
> dwc3_ep *dep)
> if (dep->flags & DWC3_EP_STALL)
> __dwc3_gadget_ep_set_halt(dep, 0, false);
>
> - reg = dwc3_readl(dwc->regs, DWC3_DALEPENA);
> - reg &= ~DWC3_DALEPENA_EP(dep->number);
> - dwc3_writel(dwc->regs, DWC3_DALEPENA, reg);
> -
> - /* Clear out the ep descriptors for non-ep0 */
> if (dep->number > 1) {
> + reg = dwc3_readl(dwc->regs, DWC3_DALEPENA);
> + reg &= ~DWC3_DALEPENA_EP(dep->number);
> + dwc3_writel(dwc->regs, DWC3_DALEPENA, reg);
> +
> + /* Clear out the ep descriptors for non-ep0 */
> dep->endpoint.comp_desc = NULL;
> dep->endpoint.desc = NULL;
> }
>
I was able to reproduce the endxfer timeout w/ the above change. I'm
assuming you didn't want me to include any parts of my change while
testing, right?
Current sequence in dwc3_gadget_pullup(0) is that we should call
dwc3_stop_active_transfers() before we call __dwc3_gadget_stop().
(gadget stop will call the ep disable for EP[0] and EP[1]) This might
be why the issue would still be occurring.
The attached ftrace that captures a situation where a forced BUG will
occur on the first instance of the timeout.
[ 1573.687689437 0x724885c22] event (000020c0): ep0out: Transfer
Not Ready [0] (Not Active) [Status Phase]
[ 1573.687738707 0x724885fd3] dbg_send_ep_cmd: ep0out: cmd
'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
Successful
[ 1573.698040582 0x7248b6477] dbg_send_ep_cmd: ep1out: cmd 'End
Transfer' [20c08] params 00000000 00000000 00000000 --> status: Timed Out
We definitely started the STATUS phase, so host could have read it and
sent the next SETUP packet while we were already in the
dwc3_stop_active_transfers() loop.
Thanks
Wesley Cheng
View attachment "a600000.ssusb.trace.txt" of type "text/plain" (166829 bytes)
Powered by blists - more mailing lists