Message-Id: <22969fbd-c16b-9443-7673-1e0ae72c873f@synopsys.com>
Date: Tue, 10 Aug 2021 11:12:16 +0800
From: Ray Chi <raychi@...gle.com>
To: thinh.nguyen@...opsys.com, Wesley Cheng <wcheng@...eaurora.org>,
"balbi@...nel.org" <balbi@...nel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
John Stultz <john.stultz@...aro.org>
Cc: jackp@...eaurora.org, linux-kernel@...r.kernel.org,
linux-usb@...r.kernel.org, albertccwang@...gle.com,
Thinh Nguyen <Thinh.Nguyen@...opsys.com>
Subject: Re: [PATCH] usb: dwc3: gadget: Use list_replace_init() before
traversing lists
From: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
> + John Stultz
>
> Wesley Cheng wrote:
> > The list_for_each_entry_safe() macro saves the current item (n) and
> > the item after (n+1), so that n can be safely removed without
> > corrupting the list. However, when traversing the list and removing
> > items using gadget giveback, the DWC3 lock is briefly released,
> > allowing other routines to execute. There is a situation where, while
> > items are being removed from the cancelled_list using
> > dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
> > routine is running in parallel (due to UDC unbind). As the cleanup
> > routine removes n, and the pullup disable removes n+1, once the
> > cleanup retakes the DWC3 lock, it references a request that was already
> > removed/handled. With list debug enabled, this leads to a panic.
> > Ensure all instances of the macro are replaced where gadget giveback
> > is used.
> >
> > Example call stack:
> >
> > Thread#1:
> > __dwc3_gadget_ep_set_halt() - CLEAR HALT
> > -> dwc3_gadget_ep_cleanup_cancelled_requests()
> > ->list_for_each_entry_safe()
> > ->dwc3_gadget_giveback(n)
> > ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
> > ->spin_unlock
> > ->Thread#2 executes
> > ...
> > ->dwc3_gadget_giveback(n+1)
> > ->Already removed!
> >
> > Thread#2:
> > dwc3_gadget_pullup()
> > ->waiting for dwc3 spin_lock
> > ...
> > ->Thread#1 released lock
> > ->dwc3_stop_active_transfers()
> > ->dwc3_remove_requests()
> > ->fetches n+1 item from cancelled_list (n removed by Thread#1)
> > ->dwc3_gadget_giveback()
> > ->dwc3_gadget_del_and_unmap_request()- n+1
> > deleted[cancelled_list]
> > ->spin_unlock
> >
> > Fix this condition by utilizing list_replace_init(), and traversing
> > through a local copy of the current elements in the endpoint lists.
> > This will also set the parent list as empty, so if another thread is
> > also looping through the list, it will be empty on the next iteration.
> >
> > Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
> > Signed-off-by: Wesley Cheng <wcheng@...eaurora.org>
> >
> > ---
> > Previous patchset:
> > https://lore.kernel.org/linux-usb/1620716636-12422-1-git-send-email-wcheng@codeaurora.org/
> > ---
> > drivers/usb/dwc3/gadget.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index a29a4ca..3ce6ed9 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1926,9 +1926,13 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > {
> > struct dwc3_request *req;
> > struct dwc3_request *tmp;
> > + struct list_head local;
> > struct dwc3 *dwc = dep->dwc;
> >
> > - list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
> > +restart:
> > + list_replace_init(&dep->cancelled_list, &local);
> > +
> > + list_for_each_entry_safe(req, tmp, &local, list) {
> > dwc3_gadget_ep_skip_trbs(dep, req);
> > switch (req->status) {
> > case DWC3_REQUEST_STATUS_DISCONNECTED:
> > @@ -1946,6 +1950,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > break;
> > }
> > }
> > +
> > + if (!list_empty(&dep->cancelled_list))
> > + goto restart;
> > }
> >
> > static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> > @@ -3190,8 +3197,12 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> > {
> > struct dwc3_request *req;
> > struct dwc3_request *tmp;
> > + struct list_head local;
> >
> > - list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > +restart:
> > + list_replace_init(&dep->started_list, &local);
> > +
> > + list_for_each_entry_safe(req, tmp, &local, list) {
> > int ret;
> >
> > ret = dwc3_gadget_ep_cleanup_completed_request(dep, event,
> > @@ -3199,6 +3210,9 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> > if (ret)
> > break;
I also hit the connection issue. The cause is that when this loop breaks
early, the dwc3 requests still sitting in the local list are never put
back on started_list, so they are silently dropped.
> > }
> > +
> > + if (!list_empty(&dep->started_list))
> > + goto restart;
>
> This is not right. We don't clean up the entire started list here.
> Sometimes we end early because some TRBs are completed but not all.
Yes, I agree. Instead of the restart check, I think the function can
check whether the local list is non-empty after the loop and move the
unhandled requests straight back onto started_list.
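For illustration only, here is a minimal userspace sketch of that idea. It
is not the dwc3 code: the list helpers are small stand-ins for the ones in
<linux/list.h>, and struct request, its done flag, and cleanup_completed()
are hypothetical names made up for this example. It assumes a request that
is not yet completed ends the loop, as in the early-exit case Thinh
describes, and it does not model the lock being dropped during giveback.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers (assumption: same semantics). */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = e;
}

/* Move every entry of 'old' onto 'repl' and leave 'old' empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
	INIT_LIST_HEAD(old);
}

/* Insert all of 'list' at the front of 'head' and leave 'list' empty. */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
	struct list_head *first, *last, *at;

	if (list_empty(list))
		return;
	first = list->next;
	last = list->prev;
	at = head->next;
	first->prev = head;
	head->next = first;
	last->next = at;
	at->prev = last;
	INIT_LIST_HEAD(list);
}

/* Hypothetical request type; 'done' models "all TRBs completed". */
struct request { struct list_head list; int id; int done; };

/* Give back completed requests; returns how many were given back. */
static int cleanup_completed(struct list_head *started)
{
	struct list_head local, *pos, *tmp;
	int given_back = 0;

	assert(started != NULL);
	list_replace_init(started, &local);

	for (pos = local.next, tmp = pos->next; pos != &local;
	     pos = tmp, tmp = pos->next) {
		struct request *req = (struct request *)
			((char *)pos - offsetof(struct request, list));

		if (!req->done)
			break;		/* end early: not all TRBs completed */
		list_del(pos);		/* stands in for the giveback */
		given_back++;
	}

	/* Restore unhandled requests ahead of anything queued meanwhile. */
	list_splice_init(&local, started);
	return given_back;
}
```

With four queued requests of which only the first two are done, the call
gives back two and leaves the other two on the started list in their
original order. Splicing at the head rather than the tail is a design
choice here, so that older requests stay ahead of any newly queued ones.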
> BR,
> Thinh
>
Best regards,
Ray