[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20210628141027.GA656159@rowland.harvard.edu>
Date: Mon, 28 Jun 2021 10:10:27 -0400
From: Alan Stern <stern@...land.harvard.edu>
To: linyyuan@...eaurora.org
Cc: Felipe Balbi <balbi@...nel.org>,
Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
Jack Pham <jackp@...eaurora.org>
Subject: Re: [PATCH] usb: dwc3: fix race of usb_gadget_driver operation
On Mon, Jun 28, 2021 at 05:36:22PM +0800, linyyuan@...eaurora.org wrote:
> On 2021-06-27 22:09, Alan Stern wrote:
> >
> > CPU0 CPU1
> > ---- ----
> > usb_gadget_remove_driver runs
> > Calls synchronize_irq
> > synchronize_irq returns
> > Calls udc_driver_unbind
> > IRQ happens for disconnect
> > Handler unlocks dwc->lock
> > Calls dwc->gadget_driver->disconnect
> > Gadget driver has already been unbound
> > and is not prepared to handle a
> > callback, so it crashes
> > Calls usb_gadget_udc_stop
> > dwc->gadget_driver is
> > set to NULL
> >
> > Without the async_callbacks mechanism, the gadget driver can get a
> > callback at the wrong time (after it has been unbound), which might
> > cause it to crash.
> 1. do you think we need to back to my original patch,
> https://lore.kernel.org/linux-usb/20210619154309.52127-1-linyyuan@codeaurora.org/T/#t
No, I think the async_callbacks approach is slightly better.
> i think we can add the spin lock or mutex lock to protect this kind of race
> from UDC layer, it will be easy understanding.
We don't actually need a lock or mutex to fix this problem. We just
need to make the remove_driver sequence issue two calls to the UDC
driver rather than just one: async_callbacks and udc_stop.
> 2. if you insist this kind of change, how to change following code in dwc3 ?
> if (dwc->gadget_driver && dwc->gadget_driver->disconnect) {
>
> 2.1 if (dwc->async_callbacks && dwc->gadget_driver->disconnect) {
> or
> 2.2 if (dwc->async_callbacks && vdwc->gadget_driver &&
> dwc->gadget_driver->disconnect) {
Either one would be okay. If async_callbacks is enabled then
dwc->gadget_driver should never be NULL, but it won't hurt to check.
After all, disconnect does not occur often and it doesn't need to run
extremely quickly.
Alan Stern
Powered by blists - more mailing lists