Message-ID: <20231012175912.umc3ugzk4iqwtcp3@synopsys.com>
Date:   Thu, 12 Oct 2023 17:59:29 +0000
From:   Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To:     Krishna Kurapati <quic_kriskura@...cinc.com>
CC:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "quic_ppratap@...cinc.com" <quic_ppratap@...cinc.com>,
        "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>,
        "quic_ugoswami@...cinc.com" <quic_ugoswami@...cinc.com>
Subject: Re: [RFC] usb: dwc3: core: Fix RAM interface getting stuck during
 enumeration

On Wed, Oct 11, 2023, Krishna Kurapati wrote:
> Fix the RAM interface getting stuck during enumeration, which leaves
> the controller unresponsive to any command.
> 
> During plug-out test cases, it is sometimes seen that no events
> are generated by the controller and all CSR register reads give "0"
> and CSR_Timeout bit gets set indicating that CSR reads/writes are
> timing out or timed out.
> 
> The issue comes up on different instances of enumeration on different
> platforms. On one platform, the debug log is as follows:
> 
> Prepared a TRB on ep0out and did start transfer to get set
> address request from host:
> 
> <...>-7191    [000] D..1.    66.421006: dwc3_gadget_ep_cmd: ep0out:
> cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 -->
> status: Successful
> 
> <...>-7191    [000] D..1.    66.421196: dwc3_event: event (0000c040):
> ep0out: Transfer Complete (sIL) [Setup Phase]
> 
> <...>-7191    [000] D..1.    66.421197: dwc3_ctrl_req: Set
> Address(Addr = 01)
> 
> Then XFER NRDY received on ep0in for zero length status phase and
> a Start Transfer was done on ep0in with 0-length packet in 2 Stage
> status phase:
> 
> <...>-7191    [000] D..1.    66.421249: dwc3_event: event (000020c2):
> ep0in: Transfer Not Ready [00000000] (Not Active) [Status Phase]
> 
> <...>-7191    [000] D..1.    66.421266: dwc3_prepare_trb: ep0in: trb
> ffffffc00fcfd000 (E0:D0) buf 00000000efffa000 size 0 ctrl 00000c33
> sofn 00000000 (HLcs:SC:status2)
> 
> <...>-7191    [000] D..1.    66.421387: dwc3_gadget_ep_cmd: ep0in: cmd
> 'Start Transfer' [406] params 00000000 efffa000 00000000 -->status:
> Successful
> 
> Then a bus reset was received directly after 500 msec. Software never
> got the cmd complete for the start transfer done in status phase. Here
> the RAM interface is stuck. So host issues a bus reset as link is
> idle for 500 msec:
> 
> <...>-7191    [000] D..1.    66.935603: dwc3_event: event (00000101):
> Reset [U0]
> 
> Then software sees that it is in status phase and issues an ENDXFER
> on ep0in, which times out waiting for CMDACT to go to '0':
> 
> <...>-7191    [000] D..1.    66.958249: dwc3_gadget_ep_cmd: ep0in: cmd
> 'End Transfer' [10508] params 00000000 00000000 00000000 --> status:
> Timed Out
> 
> Upon debug with Synopsys, it turns out that the root cause is as
> follows:
> 
> During any transfer, if the data is not successfully transmitted,
> then a Done (with failure) handshake is returned, so that the BMU
> can re-attempt the same data again by rewinding its data pointers.
> 
> But, if the USB IN is a 0-length payload (which is what is happening
> in this case - 2 stage status phase of set_address), then there is no
> need to rewind the pointers and the Done (with failure) handshake is
> not returned for failure case. This keeps the Request-Done interface
> busy till the next Done handshake. The MAC sends the 0-length payload
> again when the host requests. If the transmission is successful this
> time, the Done (with success) handshake is provided back. Otherwise,
> it repeats the same steps again.
> 
> If the cable is disconnected or if the Host aborts the transfer on 3
> consecutive failed attempts, the Request-Done handshake is not
> complete. This keeps the interface busy.
> 
> The subsequent RAM access cannot proceed until the above pending
> transfer is complete. This results in failure of any access to RAM
> address locations. Many of the EndPoint commands need to access the
> RAM and they would fail to complete successfully.
> 
> Furthermore, when cable removal happens, this would not generate a
> disconnect event and the "connected" flag remains true, always
> blocking suspend.
> 
> Synopsys confirmed that the issue is present on all USB3 devices and
> as a workaround, suggested to re-initialize device mode.
> 
> Signed-off-by: Krishna Kurapati <quic_kriskura@...cinc.com>
> ---
>  drivers/usb/dwc3/core.c   | 20 ++++++++++++++++++++
>  drivers/usb/dwc3/core.h   |  4 ++++
>  drivers/usb/dwc3/drd.c    |  5 +++++
>  drivers/usb/dwc3/gadget.c |  6 ++++--
>  4 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 44ee8526dc28..d18b81cccdc5 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -122,6 +122,7 @@ static void __dwc3_set_mode(struct work_struct *work)
>  	unsigned long flags;
>  	int ret;
>  	u32 reg;
> +	u8 timeout = 100;
>  	u32 desired_dr_role;
>  
>  	mutex_lock(&dwc->mutex);
> @@ -137,6 +138,25 @@ static void __dwc3_set_mode(struct work_struct *work)
>  	if (!desired_dr_role)
>  		goto out;
>  
> +	/*
> +	 * STAR 5001544 - If cable disconnect doesn't generate
> +	 * disconnect event in device mode, then re-initialize the
> +	 * controller.
> +	 */
> +	if ((dwc->cable_disconnected == true) &&
> +		(dwc->current_dr_role == DWC3_GCTL_PRTCAP_DEVICE)) {
> +		while (dwc->connected == true && timeout != 0) {
> +			mdelay(10);
> +			timeout--;
> +		}
> +
> +		if (timeout == 0) {
> +			dwc3_gadget_soft_disconnect(dwc);
> +			udelay(100);
> +			dwc3_gadget_soft_connect(dwc);
> +		}
> +	}
> +
>  	if (desired_dr_role == dwc->current_dr_role)
>  		goto out;
>  
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index c6c87acbd376..7642330cf608 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -1355,6 +1355,7 @@ struct dwc3 {
>  	int			last_fifo_depth;
>  	int			num_ep_resized;
>  	struct dentry		*debug_root;
> +	bool			cable_disconnected;
>  };
>  
>  #define INCRX_BURST_MODE 0
> @@ -1568,6 +1569,9 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc);
>  
>  int dwc3_core_soft_reset(struct dwc3 *dwc);
>  
> +int dwc3_gadget_soft_disconnect(struct dwc3 *dwc);
> +int dwc3_gadget_soft_connect(struct dwc3 *dwc);
> +
>  #if IS_ENABLED(CONFIG_USB_DWC3_HOST) || IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
>  int dwc3_host_init(struct dwc3 *dwc);
>  void dwc3_host_exit(struct dwc3 *dwc);
> diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
> index 039bf241769a..593c023fc39a 100644
> --- a/drivers/usb/dwc3/drd.c
> +++ b/drivers/usb/dwc3/drd.c
> @@ -446,6 +446,8 @@ static int dwc3_usb_role_switch_set(struct usb_role_switch *sw,
>  	struct dwc3 *dwc = usb_role_switch_get_drvdata(sw);
>  	u32 mode;
>  
> +	dwc->cable_disconnected = false;
> +
>  	switch (role) {
>  	case USB_ROLE_HOST:
>  		mode = DWC3_GCTL_PRTCAP_HOST;
> @@ -454,6 +456,9 @@ static int dwc3_usb_role_switch_set(struct usb_role_switch *sw,
>  		mode = DWC3_GCTL_PRTCAP_DEVICE;
>  		break;
>  	default:
> +		if (role == USB_ROLE_NONE)
> +			dwc->cable_disconnected = true;
> +
>  		if (dwc->role_switch_default_mode == USB_DR_MODE_HOST)
>  			mode = DWC3_GCTL_PRTCAP_HOST;
>  		else
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 858fe4c299b7..a92df2e04cce 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2634,7 +2634,7 @@ static void dwc3_gadget_disable_irq(struct dwc3 *dwc);
>  static void __dwc3_gadget_stop(struct dwc3 *dwc);
>  static int __dwc3_gadget_start(struct dwc3 *dwc);
>  
> -static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
> +int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>  {
>  	unsigned long flags;
>  	int ret;
> @@ -2701,7 +2701,7 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>  	return ret;
>  }
>  
> -static int dwc3_gadget_soft_connect(struct dwc3 *dwc)
> +int dwc3_gadget_soft_connect(struct dwc3 *dwc)
>  {
>  	int ret;
>  
> @@ -3963,6 +3963,7 @@ static void dwc3_gadget_disconnect_interrupt(struct dwc3 *dwc)
>  	dwc3_gadget_dctl_write_safe(dwc, reg);
>  
>  	dwc->connected = false;
> +	dwc->cable_disconnected = true;
>  
>  	dwc3_disconnect_gadget(dwc);
>  
> @@ -4038,6 +4039,7 @@ static void dwc3_gadget_reset_interrupt(struct dwc3 *dwc)
>  	 */
>  	dwc3_stop_active_transfers(dwc);
>  	dwc->connected = true;
> +	dwc->cable_disconnected = false;
>  
>  	reg = dwc3_readl(dwc->regs, DWC3_DCTL);
>  	reg &= ~DWC3_DCTL_TSTCTRL_MASK;
> -- 
> 2.42.0
> 

We can just reset the controller when there's an End Transfer command
timeout as a failure recovery. No need to do what you're doing here.
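
One way to picture that recovery path is the standalone userspace
model below. It is only a sketch, not dwc3 driver code: struct
mock_ctrl, mock_send_cmd(), mock_reset(), mock_end_transfer() and
CMD_POLL_LIMIT are all hypothetical names invented for illustration,
and the "reset" here simply clears a flag that stands in for the hung
RAM interface.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the controller; NOT the real struct dwc3. */
struct mock_ctrl {
	bool ram_stuck;      /* models the hung Request-Done interface */
	unsigned int resets; /* number of recoveries performed so far */
};

enum cmd_status { CMD_OK = 0, CMD_TIMEDOUT = -1 };

/* Arbitrary poll budget for this model (assumption, not a driver value). */
#define CMD_POLL_LIMIT 1000

/* Model of issuing a command and polling until CMDACT clears. */
enum cmd_status mock_send_cmd(const struct mock_ctrl *c)
{
	unsigned int polls;

	for (polls = 0; polls < CMD_POLL_LIMIT; polls++) {
		if (!c->ram_stuck)
			return CMD_OK; /* CMDACT went to 0 */
	}

	return CMD_TIMEDOUT; /* poll budget exhausted */
}

/* Model of re-initializing the controller, which clears the hang. */
void mock_reset(struct mock_ctrl *c)
{
	c->ram_stuck = false;
	c->resets++;
}

/* End Transfer that resets the controller only when the command times out. */
enum cmd_status mock_end_transfer(struct mock_ctrl *c)
{
	enum cmd_status st = mock_send_cmd(c);

	if (st == CMD_TIMEDOUT)
		mock_reset(c);

	return st;
}
```

In this model, calling mock_end_transfer() on a controller with
ram_stuck set reports a timeout and performs exactly one reset; a
second call then completes normally without another reset. The point
of the shape is that recovery hangs off the one failure that is
actually observed (the End Transfer timeout), so no extra
cable-disconnect state needs to be tracked.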

BR,
Thinh
