linux-kernel - Re: 答复: Consultation on the issue of digital headphones freezing

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b1e908f2-8dfa-4d1e-8059-5e421145d154@linux.intel.com>
Date: Thu, 8 Jan 2026 15:56:39 +0200
From: Mathias Nyman <mathias.nyman@...ux.intel.com>
To: 胡连勤 <hulianqin@...o.com>,
 Mathias Nyman <mathias.nyman@...el.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 "broonie@...nel.org" <broonie@...nel.org>,
 "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>
Cc: "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: 答复: Consultation on the issue of digital headphones freezing

Hi Lianqin

On 1/7/26 12:08, 胡连勤 wrote:
> Hi  Mathias:
> 
>>>> [192165.107937][    C0] xhci-hcd xhci-hcd.3.auto: Error: Failed finding new dequeue state
>>>> [192165.107946][    C0] xhci-hcd xhci-hcd.3.auto: Failed to clear cancelled cached URB 000000002d756eab, mark clear anyway
>>>> [192165.108387][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd              --> 1st  reset
>>>> [192165.225904][T17454] usb 1-1: device descriptor read/64, error -71
>>>> [192165.442224][T17454] usb 1-1: Device not responding to setup address.
>>>> [192165.642107][T17454] usb 1-1: Device not responding to setup address.
>>>> [192165.845879][T17454] usb 1-1: device not accepting address 2, error -71
>>>> [192165.846031][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
>>>> [192165.957927][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd             --> 2st reset
>>>> [192165.958032][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup context command completion code 0x11.
>>>> [192165.958040][T17454] usb 1-1: hub failed to enable device, error -22
>>>> [192165.958165][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
>>>> [192166.070623][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd            --> 3st reset
>>>> [192166.070728][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup address command completion code 0x11.
>>>> [192166.273835][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup address command completion code 0x11.
>>>> [192166.473788][T17454] usb 1-1: device not accepting address 2, error -22
>>>> [192166.473943][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
>>>> [192166.585802][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd           -->  4st reset
> 
> Upon checking the exception log again, four device reset operations were observed.
> Analyzing the code flow, after each of the four resets, the `xhci_free_endpoint_ring` function is called,
> setting `ep->ring` to null.
> 
> The specific call is as follows:
>> usb_reset_device
>> ->usb_reset_and_verify_device
>> ->->usb_hcd_alloc_bandwidth
>> ->->->hcd->driver->check_bandwidth(hcd, udev)
>> ->->->->xhci_check_bandwidth
>> ->->->->->xhci_free_endpoint_ring
> 
> void xhci_free_endpoint_ring(struct xhci_hcd *xhci,
> 		struct xhci_virt_device *virt_dev,
> 		unsigned int ep_index)
> {
> 	xhci_ring_free(xhci, virt_dev->eps[ep_index].ring);
> 	virt_dev->eps[ep_index].ring = NULL;     ---> ep->rings = NULL
> }
> 
> static int usb_reset_and_verify_device(struct usb_device *udev)
> {
> ...
> 	-> The device will undergo four reset operations here.
> 	for (i = 0; i < PORT_INIT_TRIES; ++i) {
> 		if (hub_port_stop_enumerate(parent_hub, port1, i)) {
> 			ret = -ENODEV;
> 			break;
> 		}
> 
> 		/* ep0 maxpacket size may change; let the HCD know about it.
> 		 * Other endpoints will be handled by re-enumeration. */
> 		usb_ep0_reinit(udev);
> 		ret = hub_port_init(parent_hub, udev, port1, i, &descriptor);     ----> reset devices
> 		if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV)
> 			break;
> 	}
> ...
> }

Ok, looks like we don't call xhci_sideband_notify_ep_ring_free() in all places we should.

That being said, to solve this crash we need to make xhci_sideband_remove_endpoint() work
in cases where device is being reset or disconnected, and endpoint is being dropped.

Looks like xhci_initialize_ring_info(ep->ring) was added to xhci_sideband_remove_endpoint()
in v8 of the original patch series due to a comment about leaving the endpoint in a messy state.

https://lore.kernel.org/linux-usb/20231011002146.1821-1-quic_wcheng@quicinc.com/

Calling  xhci_initialize_ring_info() does not really clean up anything, it just sets the
software dequeue and enqueue pointers to the beginning of the ring. The xHC hardware
dequeue pointer is still untouched, and the whole ring buffer is still uncleared.

I think we can drop xhci_initialize_ring_info() call completely here.
The class driver still needs to set up and reinit the endpoint properly if it wants to continue
using it after it is removed from sideband usage.

We do want to make sure endpoint is stopped when removing it, but take into account that endpoint
might be stopped, disabled or already dropped by then.

Does the code below solve your crash?

diff --git a/drivers/usb/host/xhci-sideband.c b/drivers/usb/host/xhci-sideband.c
index a85f62a73313..2bd77255032b 100644
--- a/drivers/usb/host/xhci-sideband.c
+++ b/drivers/usb/host/xhci-sideband.c
@@ -210,7 +210,6 @@ xhci_sideband_remove_endpoint(struct xhci_sideband *sb,
  		return -ENODEV;
  
  	__xhci_sideband_remove_endpoint(sb, ep);
-	xhci_initialize_ring_info(ep->ring);
  
  	return 0;
  }
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index a148a1280126..4161c8c7721d 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2891,16 +2891,25 @@ int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, int
  			    gfp_t gfp_flags)
  {
  	struct xhci_command *command;
+	struct xhci_ep_ctx *ep_ctx;
  	unsigned long flags;
-	int ret;
+	int ret = -ENODEV;
  
  	command = xhci_alloc_command(xhci, true, gfp_flags);
  	if (!command)
  		return -ENOMEM;
  
  	spin_lock_irqsave(&xhci->lock, flags);
-	ret = xhci_queue_stop_endpoint(xhci, command, ep->vdev->slot_id,
-				       ep->ep_index, suspend);
+
+	/* make sure endpoint exists and is running before stopping it */
+	if (ep->ring) {
+		ep_ctx = xhci_get_ep_ctx(xhci, ep->vdev->out_ctx, ep->ep_index);
+		if (GET_EP_CTX_STATE(ep_ctx) == EP_STATE_RUNNING)
+			ret = xhci_queue_stop_endpoint(xhci, command,
+						       ep->vdev->slot_id,
+						       ep->ep_index, suspend);
+	}
+
  	if (ret < 0) {
  		spin_unlock_irqrestore(&xhci->lock, flags);
  		goto out