[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240308024653.dp76mzoom4fuoib3@synopsys.com>
Date: Fri, 8 Mar 2024 02:47:02 +0000
From: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To: Michael Grzeschik <mgr@...gutronix.de>
CC: Thinh Nguyen <Thinh.Nguyen@...opsys.com>, Dan Vacura <w36195@...orola.com>,
"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
Daniel Scally <dan.scally@...asonboard.com>,
Jeff Vanhoof <qjv001@...orola.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jonathan Corbet <corbet@....net>,
Laurent Pinchart <laurent.pinchart@...asonboard.com>,
Felipe Balbi <balbi@...nel.org>,
Paul Elder <paul.elder@...asonboard.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of
release after missed isoc
On Thu, Mar 07, 2024, Michael Grzeschik wrote:
> On Thu, Mar 07, 2024 at 01:57:44AM +0000, Thinh Nguyen wrote:
> > On Tue, Feb 27, 2024, Michael Grzeschik wrote:
> > > On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote:
> > > > On Thu, Feb 22, 2024, Michael Grzeschik wrote:
> > > > > For #2: I found an issue in the handling of the completion of requests in
> > > > > the started list. When the interrupt handler is *explicitly* calling
> > > > > stop_active_transfer if the overall event of the request was an missed
> > > > > event. This event value only represents the value of the request that
> > > > > was actually triggering the interrupt.
> > > > >
> > > > > It also calls ep_cleanup_completed_requests and is iterating over the
> > > > > started requests and will call giveback/complete functions of the
> > > > > requests with the proper request status.
> > > > >
> > > > > So this will also catch missed requests in the queue. However, since
> > > > > there might be, lets say 5 good requests and one missed request, what
> > > > > will happen is, that each complete call for the first good requests will
> > > > > enqueue new requests into the started list and will also call the
> > > > > updatecmd on that transfer that was already missed until the loop will
> > > > > reach the one request with the MISSED status bit set.
> > > > >
> > > > > So in my opinion the patch from Jeff makes sense when adding the
> > > > > following change aswell. With those both changes the underruns and
> > > > > broken frames finally disappear. I am still unsure about the complete
> > > > > solution about that, since with this the mentioned 5 good requests
> > > > > will be cancelled aswell. So this is still a WIP status here.
> > > > >
> > > >
> > > > When the dwc3 driver issues stop_active_transfer(), that means that the
> > > > started_list is empty and there is an underrun.
> > >
> > > At this moment this is only the case when both, pending and started list
> > > are empty. Or the interrupt event was EXDEV.
> > >
> > > The main problem is that the function
> > > dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); will
> > > issue an complete for each started request, which on the other hand will
> > > refill the pending list, and therefor after that refill the
> > > stop_active_transfer is currently never hit.
> > >
> > > > It treats the incoming requests as staled. However, for UVC, they are
> > > > still "good".
> > >
> > > Right, so in that case we can requeue them anyway. But this will have to
> > > be done after the stop transfer cmd has finished.
> > >
> > > > I think you can just check if the started_list is empty before queuing
> > > > new requests. If it is, perform stop_active_transfer() to reschedule the
> > > > incoming requests. None of the newly queue requests will be released
> > > > yet since they are in the pending_list.
> > >
> > > So that is basically exactly what my patch is doing. However in the case
> > > of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests
> > > as jeff stated. So his underlying patch is really fixing an issue here.
> >
> > What I mean is to actively check for started list on every
> > usb_ep_queue() call. Checking during
> > dwc3_gadget_ep_cleanup_completed_requests() is already too late.
>
> I see.
>
> > >
> > > > For UVC, perhaps you can introduce a new flag to usb_request called
> > > > "ignore_queue_latency" or something equivalent. The dwc3 is already
> > > > partially doing this for UVC. With this new flag, we can rework dwc3 to
> > > > clearly separate the expected behavior from the function driver.
> > >
> > > I don't know why this "extra" flag is even necessary. The code example
> > > is already working without that extra flag.
> >
> > The flag is for controller to determine what kinds of behavior the
> > function driver expects. My intention is if this extra flag is not set,
> > the dwc3 driver will not attempt to reshcedule isoc request at all (ie.
> > no stop_active_transfer()).
>
> Ok.
>
> > >
> > > Actually I even came up with an better solution. Additionally of checking if
> > > one of the requests in the started list was missed, we can activly check if
> > > the trb ring did run dry and if dwc3_gadget_endpoint_trbs_complete is
> > > going to enqueue in to the empty trb ring.
> > >
> > > So my whole change looks like that:
> > >
> > > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> > > index efe6caf4d0e87..2c8047dcd1612 100644
> > > --- a/drivers/usb/dwc3/core.h
> > > +++ b/drivers/usb/dwc3/core.h
> > > @@ -952,6 +952,7 @@ struct dwc3_request {
> > > #define DWC3_REQUEST_STATUS_DEQUEUED 3
> > > #define DWC3_REQUEST_STATUS_STALLED 4
> > > #define DWC3_REQUEST_STATUS_COMPLETED 5
> > > +#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
> > > #define DWC3_REQUEST_STATUS_UNKNOWN -1
> > > u8 epnum;
> > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > index 858fe4c299b7a..a31f4d3502bd3 100644
> > > --- a/drivers/usb/dwc3/gadget.c
> > > +++ b/drivers/usb/dwc3/gadget.c
> > > @@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > > req = next_request(&dep->cancelled_list);
> > > dwc3_gadget_ep_skip_trbs(dep, req);
> > > switch (req->status) {
> > > + case 0:
> > > + dwc3_gadget_giveback(dep, req, 0);
> > > + break;
> > > case DWC3_REQUEST_STATUS_DISCONNECTED:
> > > dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
> > > break;
> > > @@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > > case DWC3_REQUEST_STATUS_STALLED:
> > > dwc3_gadget_giveback(dep, req, -EPIPE);
> > > break;
> > > + case DWC3_REQUEST_STATUS_MISSED_ISOC:
> > > + dwc3_gadget_giveback(dep, req, -EXDEV);
> > > + break;
> > > default:
> > > dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
> > > dwc3_gadget_giveback(dep, req, -ECONNRESET);
> > > @@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
> > > return ret;
> > > }
> > > +static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep)
> > > +{
> > > + struct dwc3_request *req;
> > > + struct dwc3_request *tmp;
> > > + int ret = 0;
> > > +
> > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > > + struct dwc3_trb *trb;
> > > +
> > > + trb = req->trb;
> > > + switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) {
> > > + case DWC3_TRBSTS_MISSED_ISOC:
> > > + /* Isoc endpoint only */
> > > + ret = -EXDEV;
> > > + break;
> > > + case DWC3_TRB_STS_XFER_IN_PROG:
> > > + /* Applicable when End Transfer with ForceRM=0 */
> > > + case DWC3_TRBSTS_SETUP_PENDING:
> > > + /* Control endpoint only */
> > > + case DWC3_TRBSTS_OK:
> > > + default:
> > > + ret = 0;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + return ret;
> > > +}
> > > +
> > > static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> > > const struct dwc3_event_depevt *event, int status)
> > > {
> > > @@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
> > > {
> > > struct dwc3 *dwc = dep->dwc;
> > > bool no_started_trb = true;
> > > + unsigned int transfer_in_flight = 0;
> > > +
> > > + /* It is possible that the interrupt thread was delayed by
> > > + * scheduling in the system, and therefor the HW has already
> > > + * run dry. In that case the last trb in the queue is already
> > > + * handled by the hw. By checking the HWO bit we know to restart
> > > + * the whole transfer. The condition to appear is more likelely
> > > + * if not every trb has the IOC bit set and therefor does not
> > > + * trigger the interrupt thread fewer.
> > > + */
> > > + if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) {
> > > + struct dwc3_trb *trb;
> > > - dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> > > + trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
> > > + transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO;
> > > + }
> > > - if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> > > - goto out;
> > > + if (status == -EXDEV || !transfer_in_flight) {
> > > + struct dwc3_request *tmp;
> > > + struct dwc3_request *req;
> > > - if (!dep->endpoint.desc)
> > > - return no_started_trb;
> > > + if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> > > + dwc3_stop_active_transfer(dep, true, true);
> > > - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > - list_empty(&dep->started_list) &&
> > > - (list_empty(&dep->pending_list) || status == -EXDEV))
>
> @[!!here!!]
>
> > > - dwc3_stop_active_transfer(dep, true, true);
> > > - else if (dwc3_gadget_ep_should_continue(dep))
> > > - if (__dwc3_gadget_kick_transfer(dep) == 0)
> > > - no_started_trb = false;
> > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > > + dwc3_gadget_move_cancelled_request(req,
> > > + (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ?
> > > + DWC3_REQUEST_STATUS_MISSED_ISOC : 0);
> > > + }
> > > + } else {
> > > + dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> > > +
> > > + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> > > + goto out;
> > > +
> > > + if (!dep->endpoint.desc)
> > > + return no_started_trb;
> > > +
> > > + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > + list_empty(&dep->started_list) && list_empty(&dep->pending_list))
> > > + dwc3_stop_active_transfer(dep, true, true);
> > > + else if (dwc3_gadget_ep_should_continue(dep))
> > > + if (__dwc3_gadget_kick_transfer(dep) == 0)
> > > + no_started_trb = false;
> > > + }
> > > out:
> > > /*
> > >
> > > I will seperate the whole hunk into smaller changes and send an v1
> > > the next days to review.
> > >
>
> I finally send a v1 of my series.
>
> https://lore.kernel.org/linux-usb/20240307-dwc3-gadget-complete-irq-v1-0-4fe9ac0ba2b7@pengutronix.de/
>
> For the rest of the discussion, I would like to move the conversation to
> the newly send series.
I saw your pushes. Thanks. I'll review and move the discussion there.
>
> > No, we should not reschedule for every missed-isoc. We only want to
> > target underrun condition.
>
> As you stated above, with reschedule what you mean is calling
> stop_transfer after a missed transfer was seen?
>
> If so, why is this condition in there already? (@[!!here!!])
>
It's only to reschedule if started_list is empty _and_ if there's either
no pending request or there's a missed isoc. Not for every missed isoc.
BR,
Thinh
Powered by blists - more mailing lists