lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 28 Feb 2022 18:09:36 -0800
From:   Wesley Cheng <wcheng@...eaurora.org>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        Wesley Cheng <quic_wcheng@...cinc.com>,
        Jung Daehwan <dh10.jung@...sung.com>
Cc:     Felipe Balbi <balbi@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>
Subject: Re: [PATCH v2 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING in
 ep cmd fails

Hi Thinh,

On 2/28/2022 5:09 PM, Thinh Nguyen wrote:
> Hi Wesley,
> 
> Wesley Cheng wrote:
>> Hi Thinh,
>>
>> On 2/24/2022 5:44 PM, Thinh Nguyen wrote:
>>> Wesley Cheng wrote:
>>>> Hi Thinh,
>>>>
>>>> On 2/24/2022 5:26 PM, Thinh Nguyen wrote:
>>>>> Hi Wesley,
>>>>>
>>>>> Wesley Cheng wrote:
>>>>>> Hi Thinh,
>>>>>>
>>>>>> On 2/22/2022 6:59 PM, Thinh Nguyen wrote:
>>>>>>> Hi Wesley,
>>>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>>>
>>>>>>> Can you provide a full trace?
>>>>>>>
>>>>>> Sorry for doing this, but we didn't have a way to share the file
>>>>>> directly, so I have to paste the trace below. :/
>>>>>>
>>>>>>> Btw, when you experiment with increasing the timeout to 10 seconds, did
>>>>>>> you use the mdelay() in each loop? The delta time for End Transfer
>>>>>>> timeout here seems very small.
>>>>>>>
>>>>>> This particular run did not modify the timeout for the cmd status polling.
>>>>>>
>>>>>> Is that something that you want to try?  Not pertaining to this dequeue
>>>>>> situation, during the pullup disable path debug, I noticed that as long
>>>>>> as there was a pending EP0 SETUP packet (not data/status) cached in the
>>>>>> controller, it would also have endxfer timeouts on non-control EPs.  We
>>>>>> were able to capture a bus sniffer log + ftrace, and it showed that at
>>>>>> the point of the error, the lecroy/bus log saw a SETUP token ACK'ed,
>>>>>> while on the ftrace, there were endxfer timeouts for all non-control EPs
>>>>>> (due to the dwc3_gadget_stop_active_transfers() call)
>>>>>>
>>>>>> Thanks
>>>>>> Wesley
>>>>>>
>>>>>> TRACE
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>> [ 1023.912550050       0x4af666fd2]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep4in)
>>>>>> [ 1023.912934738       0x4af668cab]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1023.913361561       0x4af66acae]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1024.336480573       0x4afe2a29b]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep5in)
>>>>>> [ 1024.336855573       0x4afe2bebc]   dbg_send_ep_cmd: ep5in: cmd 'Clear
>>>>>> Stall' [405] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1024.337264531       0x4afe2dd68]   dbg_send_ep_cmd: ep0in: cmd 'Start
>>>>>> Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
>>>>>> [ 1024.351910156       0x4afe727d4]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1024.363139896       0x4afea720f]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep4in)
>>>>>> [ 1024.363898281       0x4afeaaaf0]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1024.364734010       0x4afeae99e]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1024.776173803       0x4b063739a]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep5in)
>>>>>> [ 1024.776907188       0x4b063aa9b]   dbg_send_ep_cmd: ep5in: cmd 'Clear
>>>>>> Stall' [405] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1024.777646147       0x4b063e208]   dbg_send_ep_cmd: ep0in: cmd 'Start
>>>>>> Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
>>>>>> [ 1024.788641980       0x4b0671ab8]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1024.799682397       0x4b06a56bf]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep4in)
>>>>>> [ 1024.800442605       0x4b06a8fc3]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1024.801184584       0x4b06ac769]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1025.222173752       0x4b0e61d99]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep5in)
>>>>>> [ 1025.222991669       0x4b0e65af1]   dbg_send_ep_cmd: ep5in: cmd 'Clear
>>>>>> Stall' [405] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1025.223850731       0x4b0e69b5f]   dbg_send_ep_cmd: ep0in: cmd 'Start
>>>>>> Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
>>>>>> [ 1025.239023492       0x4b0eb0d54]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1025.248963179       0x4b0edf6ce]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep4in)
>>>>>> [ 1025.249709846       0x4b0ee2ecd]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1025.250533440       0x4b0ee6c93]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1025.667084535       0x4b16875e7]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep5in)
>>>>>> [ 1025.667827764       0x4b168ada6]   dbg_send_ep_cmd: ep5in: cmd 'Clear
>>>>>> Stall' [405] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1025.668635993       0x4b168ea44]   dbg_send_ep_cmd: ep0in: cmd 'Start
>>>>>> Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
>>>>>> [ 1025.674777764       0x4b16ab6e7]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1025.684733597       0x4b16da197]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep4in)
>>>>>> [ 1025.685336150       0x4b16dcec7]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1025.686112608       0x4b16e0904]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>>>>>> Successful
>>>>>> [ 1025.698842347       0x4b171c3bf]   dbg_send_ep_cmd: ep0out: cmd 'Set
>>>>>> Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1026.444051569       0x4b24c1670]   dbg_send_ep_cmd: ep0out: cmd
>>>>>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: No
>>>>>> Resource
>>>>>
>>>>> Why did this fail with No Resource? Was it sent before the control TRB
>>>>> completion?
>>>>>
>>>>>> [ 1027.225705425       0x4b3311678]   dbg_trace_log_ctrl: Clear Endpoint
>>>>>> Feature(Halt ep5in)
>>>>>> [ 1027.225820165       0x4b3311f14]   dbg_send_ep_cmd: ep5in: cmd 'Clear
>>>>>> Stall' [405] params 00000000 00000000 00000000 --> status: Successful
>>>>>> [ 1027.225931467       0x4b331276c]   dbg_send_ep_cmd: ep0in: cmd 'Start
>>>>>> Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
>>>>>> [ 1027.234311467       0x4b3339bed]   dbg_ep_dequeue: ep7in: req
>>>>>> ffffff87f99b8f00 length 0/61696 zsI ==> -115
>>>>>> [ 1027.247409123       0x4b337723f]   dbg_send_ep_cmd: ep7in: cmd 'End
>>>>>> Transfer' [f0d08] params 00000000 00000000 00000000 --> status: Timed Out
>>>>>>
>>>>>
>>>>> If the Start Transfer for ep0 failed and the Setup packet can't be
>>>>> DMA'ed out, the End Transfer may not go through.
>>>>>
>>>>> Can you leave the log unfiltered and is it possible to send as
>>>>> attachment? It's missing some important info in this log.
>>>>>
>>>> Are there any specific logs you were looking for?
>>>> trace_dwc3_event
>>>> trace_dwc3_prepare_trb
>>>>
>>>> Part of reproducing this issue will not allow me to enable tracing, so I
>>>> have to make changes in our driver to reroute the logs elsewhere.  So
>>>> far I've added changes to do:
>>>> 	trace_dwc3_ctrl_req
>>>> 	trace_dwc3_ep_queue
>>>> 	trace_dwc3_ep_dequeue
>>>> 	trace_dwc3_gadget_giveback
>>>> 	trace_dwc3_gadget_ep_cmd
>>>>
>>>> I can add the above 2 tracepoints as well, but wondering if you needed
>>>> anything else.
>>>>
>>>
>>> If possible, please have these in addition to the above.
>>>
>>> trace_dwc3_event
>>> trace_dwc3_complete_trb
>>>
>> Enabled the traces you recommended.  I think it captures it pretty clear
>> where the status phase was still in progress when the dequeue came in,
>> which led to the end transfer timeout.  Let me know if you need another log.
>>
> 
> Thanks for the trace! I think I see the problem.
> 
> [ 2181.481956865       0x9dc63f265]   dbg_complete: ep6in: trb ffffffc01e7f52a0 (E43:D43) buf 00000000ebaf0000 size 1x 0 ctrl 00000810 (hlcs:sC:normal)
> [ 2181.482044730       0x9dc63f8fc]   dbg_gadget_giveback: ep6in: req ffffff8860657500 length 8/8 zsI ==> 0
> [ 2181.482222490       0x9dc640651]   event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase]
> [ 2181.482273271       0x9dc640a20]   dbg_trace_log_ctrl: Get Interface Status(Intf = 4, Length = 20)
> [ 2181.482334782       0x9dc640ebc]   dbg_ep_queue: ep6in: req ffffff8860657500 length 0/8 zsI ==> -115
> [ 2181.482357386       0x9dc64106e]   dbg_prepare: ep6in: trb ffffffc01e7f52b0 (E44:D43) buf 00000000ea578000 size 1x 8 ctrl 00000811 (Hlcs:sC:normal)
> [ 2181.482391865       0x9dc641304]   dbg_send_ep_cmd: ep6in: cmd 'Update Transfer' [d0007] params 00000000 00000000 00000000 --> status: Successful
> [ 2181.482485615       0x9dc641a0d]   dbg_send_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
> [ 2181.482565303       0x9dc642006]   event (000010c0): ep0out: Transfer Not Ready [0] (Not Active) [Data Phase]
> [ 2181.482719417       0x9dc642b96]   event (00002040): ep0out: Transfer Complete (Sil) [Data Phase]
> [ 2181.482814938       0x9dc6432c0]   dbg_gadget_giveback: ep0out: req ffffff87df84d900 length 20/20 zsI ==> 0
> [ 2181.482926084       0x9dc643b16]   event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) [Status Phase]
> [ 2181.483024261       0x9dc644272]   dbg_send_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful
> 
> The control status isn't completed here.
> 
> [ 2181.483069521       0x9dc6445d7]   dbg_ep_dequeue: ep2in: req ffffff879f5a8b00 length 0/63680 zsI ==> -115
> [ 2181.496068792       0x9dc6814c9]   dbg_send_ep_cmd: ep2in: cmd 'End Transfer' [50d08] params 00000000 00000000 00000000 --> status: Timed Out
> 
> But the dequeue may come when host already sent a new Setup packet.
> The ep0out hasn't started yet at the point.
> 
> Due to various system latency, I can see that this can happen when
> the dwc3 driver hasn't received the interrupt notified the status stage
> event yet.
> 
> If that's the case, the host may have already sent the Setup packet
> at this point. So the End Transfer may get stuck if the Setup packet
> isn't DMA out yet.
> 
> Can you try the change below to see if it resolves the issue?
Thanks, Thinh.  Sure I'll give it a try with this change.  This is very
similar to the change proposed here as well:

https://lore.kernel.org/linux-usb/20220216000835.25400-3-quic_wcheng@quicinc.com/

One thing to mention is that, I'm not sure how dependable checking soley
the ep0state would be.  I've seen some scenarios where we'd run into the
end transfer timeout during the time between inspecting the SETUP packet
(dwc3_ep0_inspect_setup()) and when the data phase is queued.  The
timing of the data phase can potentially differ if it is a vendor
specific control request.

This is what led me to try w/ marking the endxfer delay after getting a
timed out scenario.  Just something to consider :).  I'll let you know
how it goes!  Thanks again.

Thanks
Wesley Cheng

> Code is not tested. Let me know if the logic makes sense. The change is
> to delay End Transfer when the Start Transfer for ep0out is completed.
> 
> Thanks,
> Thinh
> 
> 
> 
> 
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index eb9c1efced05..c55eddff0239 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -736,6 +736,7 @@ struct dwc3_ep {
>  #define DWC3_EP_FIRST_STREAM_PRIMED    BIT(10)
>  #define DWC3_EP_PENDING_CLEAR_STALL    BIT(11)
>  #define DWC3_EP_TXFIFO_RESIZED         BIT(12)
> +#define DWC3_EP_DELAY_STOP             BIT(13)
>  
>         /* This last one is specific to EP0 */
>  #define DWC3_EP0_DIR_IN                        BIT(31)
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 658739410992..ccd1bf3e1f18 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -271,6 +271,7 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
>  {
>         struct dwc3_ep                  *dep;
>         int                             ret;
> +       int                             i;
>  
>         complete(&dwc->ep0_in_setup);
>  
> @@ -279,6 +280,19 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
>                         DWC3_TRBCTL_CONTROL_SETUP, false);
>         ret = dwc3_ep0_start_trans(dep);
>         WARN_ON(ret < 0);
> +
> +       for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
> +               struct dwc3_ep *dwc3_ep;
> +               dwc3_ep = dwc->eps[i];
> +               if (!dwc3_ep)
> +                       continue;
> +
> +               if (!(dwc3_ep->flags & DWC3_EP_DELAY_STOP))
> +                       continue;
> +
> +               dep->flags &= ~DWC3_EP_DELAY_STOP;
> +               dwc3_stop_active_transfer(dwc3_ep, true, true);
> +       }
>  }
>  
>  static struct dwc3_ep *dwc3_wIndex_to_dep(struct dwc3 *dwc, __le16 wIndex_le)
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 183b90923f51..e164ecad06ac 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -654,9 +654,6 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
>         return dwc3_send_gadget_ep_cmd(dep, DWC3_DEPCMD_SETEPCONFIG, &params);
>  }
>  
> -static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
> -               bool interrupt);
> -
>  /**
>   * dwc3_gadget_calc_tx_fifo_size - calculates the txfifo size value
>   * @dwc: pointer to the DWC3 context
> @@ -1898,6 +1895,7 @@ static int __dwc3_gadget_ep_queue(struct dwc3_ep *dep, struct dwc3_request *req)
>          * and endpoint STALL is cleared.
>          */
>         if ((dep->flags & DWC3_EP_END_TRANSFER_PENDING) ||
> +           (dep->flags & DWC3_EP_DELAY_STOP) ||
>             (dep->flags & DWC3_EP_WEDGE) ||
>             (dep->flags & DWC3_EP_STALL)) {
>                 dep->flags |= DWC3_EP_DELAY_START;
> @@ -2033,6 +2031,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>                 if (r == req) {
>                         struct dwc3_request *t;
>  
> +                       if (dwc->ep0state != EP0_SETUP_PHASE)
> +                               dep->flags |= DWC3_EP_DELAY_STOP;
> +
>                         /* wait until it is processed */
>                         dwc3_stop_active_transfer(dep, true, true);
>  
> @@ -2116,7 +2117,8 @@ int __dwc3_gadget_ep_set_halt(struct dwc3_ep *dep, int value, int protocol)
>                 list_for_each_entry_safe(req, tmp, &dep->started_list, list)
>                         dwc3_gadget_move_cancelled_request(req, DWC3_REQUEST_STATUS_STALLED);
>  
> -               if (dep->flags & DWC3_EP_END_TRANSFER_PENDING) {
> +               if ((dep->flags & DWC3_EP_END_TRANSFER_PENDING) ||
> +                   (dep->flags & DWC3_EP_DELAY_STOP)) {
>                         dep->flags |= DWC3_EP_PENDING_CLEAR_STALL;
>                         return 0;
>                 }
> @@ -3596,7 +3598,7 @@ static void dwc3_reset_gadget(struct dwc3 *dwc)
>         }
>  }
>  
> -static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
> +void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>         bool interrupt)
>  {
>         struct dwc3_gadget_ep_cmd_params params;
> @@ -3604,6 +3606,7 @@ static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>         int ret;
>  
>         if (!(dep->flags & DWC3_EP_TRANSFER_STARTED) ||
> +           (dep->flags & DWC3_EP_DELAY_STOP) ||
>             (dep->flags & DWC3_EP_END_TRANSFER_PENDING))
>                 return;
>  
> diff --git a/drivers/usb/dwc3/gadget.h b/drivers/usb/dwc3/gadget.h
> index 77df4b6d6c13..f763380e672e 100644
> --- a/drivers/usb/dwc3/gadget.h
> +++ b/drivers/usb/dwc3/gadget.h
> @@ -116,6 +116,7 @@ int dwc3_gadget_ep0_queue(struct usb_ep *ep, struct usb_request *request,
>                 gfp_t gfp_flags);
>  int __dwc3_gadget_ep_set_halt(struct dwc3_ep *dep, int value, int protocol);
>  void dwc3_ep0_send_delayed_status(struct dwc3 *dwc);
> +void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt);
>  
>  /**
>   * dwc3_gadget_ep_get_transfer_index - Gets transfer index from HW
> 
> 
>  
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ