[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f2fa6738-29f1-3434-70f2-7fba0b1b2567@raspberrypi.com>
Date: Wed, 1 Sep 2021 14:15:14 +0100
From: Phil Elwell <phil@...pberrypi.com>
To: Mathias Nyman <mathias.nyman@...ux.intel.com>,
Mathias Nyman <mathias.nyman@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jonathan Bell <jonathan@...pberrypi.com>,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] xhci: guard accesses to ep_state in xhci_endpoint_reset()
Hi Mathias,
On 01/09/2021 10:21, Mathias Nyman wrote:
> On 31.8.2021 19.02, Phil Elwell wrote:
>> From: Jonathan Bell <jonathan@...pberrypi.com>
>>
>> See https://github.com/raspberrypi/linux/issues/3981
>
> Thanks, so in a nutshell the issue looks something like:
>
> [827586.220071] xhci_hcd 0000:01:00.0: WARN Cannot submit Set TR Deq Ptr
> [827586.220087] xhci_hcd 0000:01:00.0: A Set TR Deq Ptr command is pending.
> [827723.160680] INFO: task usb-storage:93 blocked for more than 122 seconds.
>
> The blocked task is probably because xhci driver failed to give back the
> URB after failing to submit a "Set TR Deq Ptr" command. This part should
> be fixed in:
> https://lore.kernel.org/r/20210820123503.2605901-4-mathias.nyman@linux.intel.com
> which is currently in usb-next, and should be in 5.15-rc1 and future 5.12+ stable.
>
>>
>> Two read-modify-write cycles on ep->ep_state are not guarded by
>> xhci->lock. Fix these.
>>
>
> This is probably one cause for the "Warn Cannot submit Set TR Deq Ptr A Set TR
> Deq Ptr command is pending" message.
> Another possibility is that with UAS and streams we have several transfer rings
> per endpoint, meaning that if two TDs on separate stream rings on the same
> endpoint both stall, or are cancelled we could see this message.
>
> The SET_DEQ_PENDING flag in ep->ep_state should probably be per ring, not per
> endpoint. Then we also need a "rings_with_pending_set_deq" counter per endpoint
> to keep track when all set_tr_deq commands complete, and we can restart the endpoint
Jonathan, the author of the patch, may give some detailed feedback on these
statements when he has a moment - "Well, sort of... it's complicated" was the
summary.
> Anyway, my patch linked above together with this patch should make these errors
> a lot more harmless.
Yes, I think that's true. We have a downstream patch to warn about a pending
Set TR Deq Ptr command but proceed anyway, allowing systems to recover, but with
the additional spin lock usage our users are reporting no failures _and_ no
warnings.
> Let me know if you can trigger the issue with both these patches applied.
We've not tried your patch yet.
> I'll add your patch to the queue as well.
Many thanks,
Phil
Powered by blists - more mailing lists