[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250303113401.280cb911@foxbook>
Date: Mon, 3 Mar 2025 11:34:01 +0100
From: MichaĆ Pecio <michal.pecio@...il.com>
To: Mathias Nyman <mathias.nyman@...ux.intel.com>
Cc: ki.chiang65@...il.com, <gregkh@...uxfoundation.org>,
<linux-usb@...r.kernel.org>, linux-kernel@...r.kernel.org
Subject: Re: [RFT PATCH] xhci: Handle spurious events on Etron host isoc
enpoints
On Fri, 28 Feb 2025 18:18:24 +0200, Mathias Nyman wrote:
> Unplugging a USB3.0 webcam from Etron hosts while streaming results
> in errors like this:
>
> [ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr
> not part of current TD ep_index 18 comp_code 13 [ 2.646446] xhci_hcd
> 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start
> 000000002fdf8640 trb-end 000000002fdf8650 [ 2.646560] xhci_hcd
> 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
> ep_index 18 comp_code 13 [ 2.646568] xhci_hcd 0000:03:00.0: Looking
> for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end
> 000000002fdf8670
>
> Etron xHC generates two transfer events for the TRB if an error is
> detected while processing the last TRB of an isoc TD.
>
> The first event can be any sort of error (like USB Transaction or
> Babble Detected, etc), and the final event is Success.
>
> The xHCI driver will handle the TD after the first event and remove it
> from its internal list, and then print an "Transfer event TRB DMA ptr
> not part of current TD" error message after the final event.
>
> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
> transaction error mid TD.") is designed to address isoc transaction
> errors, but unfortunately it doesn't account for this scenario.
>
> This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a
> success event follows a 'short transfer' event, but the TD the event
> points to is already given back.
>
> Expand the spurious success 'short transfer' event handling to cover
> the spurious success after error on Etron hosts.
>
> Kuangyi Chiang reported this issue and submitted a different solution
> based on using error_mid_td. This commit message is mostly taken
> from that patch.
>
> Reported-by: Kuangyi Chiang <ki.chiang65@...il.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@...ux.intel.com>
Works here too, modulo the obvious build problem.
Etron with errors:
[ 1064.865311] xhci_hcd 0000:06:00.0: Transfer error for slot 1 ep 2 on endpoint
[ 1064.865322] xhci_hcd 0000:06:00.0: Successful completion on short TX for slot 1 ep 2 with last td comp code 4
[ 1064.865326] xhci_hcd 0000:06:00.0: Spurious event dma 0x00000000ffef88c0, comp_code 13 after 4
Renesas with short packets:
[ 1365.299218] xhci_hcd 0000:08:00.0: Successful completion on short TX for slot 1 ep 2 with last td comp code 13
[ 1365.299223] xhci_hcd 0000:08:00.0: Spurious event dma 0x00000000ffbbf870, comp_code 13 after 13
BTW, it says "comp_code 13 after something" because of this crazy
TRUST_TX_LENGTH hack, which changes trb_comp_code if it's success
but the residual is nonzero. If I remove the hack,
Etron:
[ 2713.630443] xhci_hcd 0000:06:00.0: Spurious event dma 0x00000000ff3b6130, comp_code 1 after 4
Renesas:
[ 4033.652300] xhci_hcd 0000:08:00.0: Spurious event dma 0x00000000ffcd1b80, comp_code 1 after 13
The hack could almost be removed now, but if there really are HCs
which report Success on the first event, this won't work for them:
> +static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci,
> + struct xhci_ring *ring)
> +{
> + switch (ring->old_trb_comp_code) {
> + case COMP_SHORT_PACKET:
> + return xhci->quirks & XHCI_SPURIOUS_SUCCESS;
Could it work without relying on fictional COMP_SHORT_PACKET events?
> + if (xhci_spurious_success_tx_event(xhci, ep_ring)) {
> + xhci_dbg(xhci, "Spurious event dma %pad, comp_code %u after %u\n",
> + &ep_trb_dma, trb_comp_code, ep_ring->old_trb_comp_code);
> + ep_ring->old_trb_comp_code = trb_comp_code;
This part will (quite arbitrarily IMO) not execute if td_list is empty.
I had this idea that "empty td_list" and "no matching TD on td_list"
are practically identical cases, and their code could be merged.
Powered by blists - more mailing lists