Message-ID: <2025081348-depict-lapel-2e9e@gregkh>
Date: Wed, 13 Aug 2025 16:33:39 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Selvarasu Ganesan <selvarasu.g@...sung.com>
Cc: Thinh.Nguyen@...opsys.com, m.grzeschik@...gutronix.de, balbi@...com,
	bigeasy@...utronix.de, linux-usb@...r.kernel.org,
	linux-kernel@...r.kernel.org, jh0801.jung@...sung.com,
	dh10.jung@...sung.com, akash.m5@...sung.com,
	hongpooh.kim@...sung.com, eomji.oh@...sung.com,
	shijie.cai@...sung.com, alim.akhtar@...sung.com,
	muhammed.ali@...sung.com, thiagu.r@...sung.com,
	stable@...r.kernel.org
Subject: Re: [PATCH v3] usb: dwc3: Remove WARN_ON for device endpoint command
 timeouts

On Fri, Aug 08, 2025 at 06:23:05PM +0530, Selvarasu Ganesan wrote:
> This commit addresses a rarely observed endpoint command timeout that
> causes a kernel panic from the WARN_ON when 'panic_on_warn' is enabled,
> and unnecessary call trace prints when 'panic_on_warn' is disabled.
> It is seen during fast software-controlled connect/disconnect test cases.
> The following is one such endpoint command timeout that we observed:
> 
> 1. Connect
>    =======
> ->dwc3_thread_interrupt
>  ->dwc3_ep0_interrupt
>   ->configfs_composite_setup
>    ->composite_setup
>     ->usb_ep_queue
>      ->dwc3_gadget_ep0_queue
>       ->__dwc3_gadget_ep0_queue
>        ->__dwc3_ep0_do_control_data
>         ->dwc3_send_gadget_ep_cmd
> 
> 2. Disconnect
>    ==========
> ->dwc3_thread_interrupt
>  ->dwc3_gadget_disconnect_interrupt
>   ->dwc3_ep0_reset_state
>    ->dwc3_ep0_end_control_data
>     ->dwc3_send_gadget_ep_cmd
> 
> In the issue scenario on Exynos platforms, we observed that control
> transfers for the previous connect had not yet completed when the End
> Transfer command was sent as part of the disconnect sequence, and both
> that command and the processing of the USB_ENDPOINT_HALT feature
> request from the host timed out.
> This may be an expected scenario since the controller is processing EP
> commands sent as part of the previous connect. It may be better to
> remove WARN_ON in all places where device endpoint commands are sent,
> to avoid an unnecessary kernel panic from the warning.
> 
> Cc: stable@...r.kernel.org
> Co-developed-by: Akash M <akash.m5@...sung.com>
> Signed-off-by: Akash M <akash.m5@...sung.com>
> Signed-off-by: Selvarasu Ganesan <selvarasu.g@...sung.com>
> Acked-by: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
> ---
> 
> Changes in v3:
> - Added Co-developed-by tags to reflect the correct authorship.
> - Added Acked-by tag as well.
> Link to v2: https://lore.kernel.org/all/20250807014639.1596-1-selvarasu.g@samsung.com/
> 
> Changes in v2:
> - Removed the 'Fixes' tag from the commit message, as this patch does
>   not contain a fix.
> - Retained the 'stable' tag, as these changes are intended to be
>   applied across all stable kernels.
> - Additionally, replaced 'dev_warn*' with 'dev_err*'.
> Link to v1: https://lore.kernel.org/all/20250807005638.thhsgjn73aaov2af@synopsys.com/
> ---
>  drivers/usb/dwc3/ep0.c    | 20 ++++++++++++++++----
>  drivers/usb/dwc3/gadget.c | 10 ++++++++--
>  2 files changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 666ac432f52d..b4229aa13f37 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
>  	dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8,
>  			DWC3_TRBCTL_CONTROL_SETUP, false);
>  	ret = dwc3_ep0_start_trans(dep);
> -	WARN_ON(ret < 0);
> +	if (ret < 0)
> +		dev_err(dwc->dev, "ep0 out start transfer failed: %d\n", ret);
> +

If this fails, why aren't you returning the error and handling it
properly?  Just throwing an error message feels like it's not going to
do much overall.

>  	for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
>  		struct dwc3_ep *dwc3_ep;
>  
> @@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc,
>  		ret = dwc3_ep0_start_trans(dep);
>  	}
>  
> -	WARN_ON(ret < 0);
> +	if (ret < 0)
> +		dev_err(dwc->dev,
> +			"ep0 data phase start transfer failed: %d\n", ret);

Same here, why not return the error and propagate it up the call stack?

>  }
>  
>  static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
> @@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
>  
>  static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep)
>  {
> -	WARN_ON(dwc3_ep0_start_control_status(dep));
> +	int	ret;
> +
> +	ret = dwc3_ep0_start_control_status(dep);
> +	if (ret)
> +		dev_err(dwc->dev,
> +			"ep0 status phase start transfer failed: %d\n", ret);

Same here.  Don't "swallow" errors that you find, that's a sure way to
paper over real problems.

Same for all other changes here.
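
For illustration only, the difference between swallowing an error and
propagating it can be sketched in plain userspace C. This is not dwc3
code: struct fake_ep, fake_start_trans(), out_start_swallow(),
out_start_propagate(), handle_setup() and the "stalled" recovery step are
all hypothetical names made up for the sketch, and the real driver
functions return void today, so their callers would have to change too.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an endpoint; not the real struct dwc3_ep. */
struct fake_ep {
	int cmd_result;	/* value the simulated endpoint command returns */
	int stalled;	/* set when the caller recovers from a failure */
};

/* Simulates issuing an endpoint command; returns 0 or a negative errno. */
static int fake_start_trans(struct fake_ep *ep)
{
	assert(ep != NULL);	/* caller bug, not a runtime condition */
	return ep->cmd_result;
}

/* Log-only pattern: report the failure, then carry on as if it worked. */
static void out_start_swallow(struct fake_ep *ep)
{
	int ret = fake_start_trans(ep);

	if (ret < 0)
		fprintf(stderr, "start transfer failed: %d\n", ret);
}

/* Alternative: report the failure and hand the code back to the caller. */
static int out_start_propagate(struct fake_ep *ep)
{
	int ret = fake_start_trans(ep);

	if (ret < 0)
		fprintf(stderr, "start transfer failed: %d\n", ret);
	return ret;
}

/* Caller that reacts to the failure instead of ignoring it. */
static int handle_setup(struct fake_ep *ep)
{
	int ret = out_start_propagate(ep);

	if (ret < 0) {
		ep->stalled = 1;	/* hypothetical recovery step */
		return ret;
	}
	return 0;
}
```

With out_start_swallow() the caller has no way to learn that the command
timed out; with out_start_propagate() the decision about how to recover
is left to handle_setup() and to whatever sits above it in the call chain.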

thanks,

greg k-h
