Message-ID: <2025081348-depict-lapel-2e9e@gregkh>
Date: Wed, 13 Aug 2025 16:33:39 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Selvarasu Ganesan <selvarasu.g@...sung.com>
Cc: Thinh.Nguyen@...opsys.com, m.grzeschik@...gutronix.de, balbi@...com,
bigeasy@...utronix.de, linux-usb@...r.kernel.org,
linux-kernel@...r.kernel.org, jh0801.jung@...sung.com,
dh10.jung@...sung.com, akash.m5@...sung.com,
hongpooh.kim@...sung.com, eomji.oh@...sung.com,
shijie.cai@...sung.com, alim.akhtar@...sung.com,
muhammed.ali@...sung.com, thiagu.r@...sung.com,
stable@...r.kernel.org
Subject: Re: [PATCH v3] usb: dwc3: Remove WARN_ON for device endpoint command timeouts

On Fri, Aug 08, 2025 at 06:23:05PM +0530, Selvarasu Ganesan wrote:
> This commit addresses a rarely observed endpoint command timeout
> that causes a kernel panic (via the warning) when 'panic_on_warn' is
> enabled, and unnecessary call trace prints when it is disabled.
> It is seen during fast software-controlled connect/disconnect test cases.
> The following is one such endpoint command timeout that we observed:
>
> 1. Connect
> =======
> ->dwc3_thread_interrupt
> ->dwc3_ep0_interrupt
> ->configfs_composite_setup
> ->composite_setup
> ->usb_ep_queue
> ->dwc3_gadget_ep0_queue
> ->__dwc3_gadget_ep0_queue
> ->__dwc3_ep0_do_control_data
> ->dwc3_send_gadget_ep_cmd
>
> 2. Disconnect
> ==========
> ->dwc3_thread_interrupt
> ->dwc3_gadget_disconnect_interrupt
> ->dwc3_ep0_reset_state
> ->dwc3_ep0_end_control_data
> ->dwc3_send_gadget_ep_cmd
>
> In the issue scenario, on Exynos platforms, we observed that control
> transfers for the previous connect had not yet completed, and that both
> the End Transfer command sent as part of the disconnect sequence and the
> processing of the USB_ENDPOINT_HALT feature request from the host timed
> out. This may be an expected scenario, since the controller is still
> processing EP commands sent as part of the previous connect. It may be
> better to remove WARN_ON in all places where device endpoint commands
> are sent, to avoid an unnecessary kernel panic due to the warning.
>
> Cc: stable@...r.kernel.org
> Co-developed-by: Akash M <akash.m5@...sung.com>
> Signed-off-by: Akash M <akash.m5@...sung.com>
> Signed-off-by: Selvarasu Ganesan <selvarasu.g@...sung.com>
> Acked-by: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
> ---
>
> Changes in v3:
> - Added Co-developed-by tags to reflect the correct authorship.
> - Added the Acked-by tag.
> Link to v2: https://lore.kernel.org/all/20250807014639.1596-1-selvarasu.g@samsung.com/
>
> Changes in v2:
> - Removed the 'Fixes' tag from the commit message, as this patch does
>   not contain a fix.
> - Retained the 'stable' tag, as these changes are intended to be
>   applied across all stable kernels.
> - Replaced 'dev_warn*' with 'dev_err*'.
> Link to v1: https://lore.kernel.org/all/20250807005638.thhsgjn73aaov2af@synopsys.com/
> ---
> drivers/usb/dwc3/ep0.c | 20 ++++++++++++++++----
> drivers/usb/dwc3/gadget.c | 10 ++++++++--
> 2 files changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 666ac432f52d..b4229aa13f37 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
> dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8,
> DWC3_TRBCTL_CONTROL_SETUP, false);
> ret = dwc3_ep0_start_trans(dep);
> - WARN_ON(ret < 0);
> + if (ret < 0)
> + dev_err(dwc->dev, "ep0 out start transfer failed: %d\n", ret);
> +

If this fails, why aren't you returning the error and handling it
properly?  Just throwing an error message feels like it's not going to
do much overall.

> for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
> struct dwc3_ep *dwc3_ep;
>
> @@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc,
> ret = dwc3_ep0_start_trans(dep);
> }
>
> - WARN_ON(ret < 0);
> + if (ret < 0)
> + dev_err(dwc->dev,
> + "ep0 data phase start transfer failed: %d\n", ret);

Same here, why not return the error and propagate it up the call stack?

> }
>
> static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
> @@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
>
> static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep)
> {
> - WARN_ON(dwc3_ep0_start_control_status(dep));
> + int ret;
> +
> + ret = dwc3_ep0_start_control_status(dep);
> + if (ret)
> + dev_err(dwc->dev,
> + "ep0 status phase start transfer failed: %d\n", ret);

Same here.  Don't "swallow" errors that you find; that's a sure way to
paper over real problems.

Same for all other changes here.
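
To make the distinction concrete, here is a minimal userspace sketch of
"log and carry on" versus "return the error and let the caller deal with
it".  Everything in it is hypothetical (struct fake_ep, fake_send_ep_cmd()
and the handlers are stand-ins, not the dwc3 API), and the recovery step
is only a placeholder; what the driver should actually do on a timeout is
a separate question.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an endpoint; not the real struct dwc3_ep. */
struct fake_ep {
	int cmd_result;	/* 0 on success, negative errno on failure */
	int started;	/* set once the transfer is considered started */
};

/* Hypothetical stand-in for issuing an endpoint command. */
static int fake_send_ep_cmd(struct fake_ep *ep)
{
	return ep->cmd_result;
}

/* Pattern questioned above: print a message and continue regardless. */
static void start_transfer_swallow(struct fake_ep *ep)
{
	int ret = fake_send_ep_cmd(ep);

	if (ret < 0)
		fprintf(stderr, "start transfer failed: %d\n", ret);

	ep->started = 1;	/* the caller cannot tell that the command failed */
}

/* Alternative: hand the error back to the caller. */
static int start_transfer_propagate(struct fake_ep *ep)
{
	int ret = fake_send_ep_cmd(ep);

	if (ret < 0)
		return ret;

	ep->started = 1;
	return 0;
}

/* The caller now has the chance to react instead of assuming success. */
static int handle_setup(struct fake_ep *ep)
{
	int ret = start_transfer_propagate(ep);

	if (ret < 0) {
		ep->started = 0;	/* placeholder for real recovery logic */
		return ret;
	}

	return 0;
}
```

With the second form, a timeout reaches handle_setup() as a negative
errno and the endpoint is never marked as started; with the first form,
the state ends up looking identical to the success case.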

thanks,

greg k-h