[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
<TYUPR06MB62171A3584A2769CBC752322D282A@TYUPR06MB6217.apcprd06.prod.outlook.com>
Date: Fri, 9 Jan 2026 10:26:22 +0000
From: 胡连勤 <hulianqin@...o.com>
To: Mathias Nyman <mathias.nyman@...ux.intel.com>, Mathias Nyman
<mathias.nyman@...el.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"broonie@...nel.org" <broonie@...nel.org>, "quic_wcheng@...cinc.com"
<quic_wcheng@...cinc.com>
CC: "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject:
答复: 答复: Consultation on the issue of digital headphones freezing
Hi Mathias:
> >>>> [192165.107937][ C0] xhci-hcd xhci-hcd.3.auto: Error: Failed finding new dequeue state
> >>>> [192165.107946][ C0] xhci-hcd xhci-hcd.3.auto: Failed to clear cancelled cached URB 000000002d756eab, mark clear anyway
> >>>> [192165.108387][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd --> 1st reset
> >>>> [192165.225904][T17454] usb 1-1: device descriptor read/64, error -71
> >>>> [192165.442224][T17454] usb 1-1: Device not responding to setup address.
> >>>> [192165.642107][T17454] usb 1-1: Device not responding to setup address.
> >>>> [192165.845879][T17454] usb 1-1: device not accepting address 2, error -71
> >>>> [192165.846031][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
> >>>> [192165.957927][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd --> 2st reset
> >>>> [192165.958032][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup context command completion code 0x11.
> >>>> [192165.958040][T17454] usb 1-1: hub failed to enable device, error -22
> >>>> [192165.958165][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
> >>>> [192166.070623][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd --> 3st reset
> >>>> [192166.070728][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup address command completion code 0x11.
> >>>> [192166.273835][T17454] xhci-hcd xhci-hcd.3.auto: ERROR: unexpected setup address command completion code 0x11.
> >>>> [192166.473788][T17454] usb 1-1: device not accepting address 2, error -22
> >>>> [192166.473943][T17454] usb 1-1: WARN: invalid context state for evaluate context command.
> >>>> [192166.585802][T17454] usb 1-1: reset full-speed USB device number 2 using xhci-hcd --> 4st reset
> >
> > Upon checking the exception log again, four device reset operations were observed.
> > Analyzing the code flow, after each of the four resets, the `xhci_free_endpoint_ring` function is called,
> > setting `ep->ring` to null.
> >
> > The specific call is as follows:
> >> usb_reset_device
> >> ->usb_reset_and_verify_device
> >> ->->usb_hcd_alloc_bandwidth
> >> ->->->hcd->driver->check_bandwidth(hcd, udev)
> >> ->->->->xhci_check_bandwidth
> >> ->->->->->xhci_free_endpoint_ring
> >
> > void xhci_free_endpoint_ring(struct xhci_hcd *xhci,
> > struct xhci_virt_device *virt_dev,
> > unsigned int ep_index)
> > {
> > xhci_ring_free(xhci, virt_dev->eps[ep_index].ring);
> > virt_dev->eps[ep_index].ring = NULL; ---> ep->rings = NULL
> > }
> >
> > static int usb_reset_and_verify_device(struct usb_device *udev)
> > {
> > ...
> > -> The device will undergo four reset operations here.
> > for (i = 0; i < PORT_INIT_TRIES; ++i) {
> > if (hub_port_stop_enumerate(parent_hub, port1, i)) {
> > ret = -ENODEV;
> > break;
> > }
> >
> > /* ep0 maxpacket size may change; let the HCD know about it.
> > * Other endpoints will be handled by re-enumeration. */
> > usb_ep0_reinit(udev);
> > ret = hub_port_init(parent_hub, udev, port1, i, &descriptor); ----> reset devices
> > if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV)
> > break;
> > }
> > ...
> > }
>
> Ok, looks like we don't call xhci_sideband_notify_ep_ring_free() in all places we should.
>
> That being said, to solve this crash we need to make xhci_sideband_remove_endpoint() work
> in cases where device is being reset or disconnected, and endpoint is being dropped.
>
> Looks like xhci_initialize_ring_info(ep->ring) was added to xhci_sideband_remove_endpoint()
> in v8 of the original patch series due to a comment about leaving the endpoint in a messy state.
>
> https://lore.kernel.org/linux-usb/20231011002146.1821-1-
> quic_wcheng%40quicinc.com%2F&data=05%7C02%7Chulianqin%40vivo.com%7C683997639e6e446a330408de4ebdc2d3%7C923e42dc4
> 8d54cbeb5821a797a6412ed%7C0%7C0%7C639034774157625174%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiI
> wLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=OS0feNyVaHoHa7TEfXLL%2F42Lvju
> Xt6WOkJTnkMspM7o%3D&reserved=0
>
> Calling xhci_initialize_ring_info() does not really clean up anything, it just sets the
> software dequeue and enqueue pointers to the beginning of the ring. The xHC hardware
> dequeue pointer is still untouched, and the whole ring buffer is still uncleared.
>
> I think we can drop xhci_initialize_ring_info() call completely here.
> The class driver still needs to set up and reinit the endpoint properly if it wants to continue
> using it after it is removed from sideband usage.
>
> We do want to make sure endpoint is stopped when removing it, but take into account that endpoint
> might be stopped, disabled or already dropped by then.
>
> Does the code below solve your crash?
>
> diff --git a/drivers/usb/host/xhci-sideband.c b/drivers/usb/host/xhci-sideband.c
> index a85f62a73313..2bd77255032b 100644
> --- a/drivers/usb/host/xhci-sideband.c
> +++ b/drivers/usb/host/xhci-sideband.c
> @@ -210,7 +210,6 @@ xhci_sideband_remove_endpoint(struct xhci_sideband *sb,
> return -ENODEV;
>
> __xhci_sideband_remove_endpoint(sb, ep);
> - xhci_initialize_ring_info(ep->ring);
>
> return 0;
> }
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index a148a1280126..4161c8c7721d 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -2891,16 +2891,25 @@ int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, int
> gfp_t gfp_flags)
> {
> struct xhci_command *command;
> + struct xhci_ep_ctx *ep_ctx;
> unsigned long flags;
> - int ret;
> + int ret = -ENODEV;
>
> command = xhci_alloc_command(xhci, true, gfp_flags);
> if (!command)
> return -ENOMEM;
>
> spin_lock_irqsave(&xhci->lock, flags);
> - ret = xhci_queue_stop_endpoint(xhci, command, ep->vdev->slot_id,
> - ep->ep_index, suspend);
> +
> + /* make sure endpoint exists and is running before stopping it */
> + if (ep->ring) {
> + ep_ctx = xhci_get_ep_ctx(xhci, ep->vdev->out_ctx, ep->ep_index);
> + if (GET_EP_CTX_STATE(ep_ctx) == EP_STATE_RUNNING)
> + ret = xhci_queue_stop_endpoint(xhci, command,
> + ep->vdev->slot_id,
> + ep->ep_index, suspend);
> + }
> +
> if (ret < 0) {
> spin_unlock_irqrestore(&xhci->lock, flags);
> goto out
This patch looks like good.
I believe this can solve the crash problem I'm currently experiencing.
I'll do a stress test and see.
Thanks
Lianqin
Powered by blists - more mailing lists