Date:   Tue, 25 Oct 2022 03:46:30 -0500
From:   Jeff Vanhoof <jdv1029@...il.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     linux-usb@...r.kernel.org,
        Daniel Scally <dan.scally@...asonboard.com>,
        Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        Jonathan Corbet <corbet@....net>,
        Laurent Pinchart <laurent.pinchart@...asonboard.com>,
        Felipe Balbi <balbi@...nel.org>,
        Paul Elder <paul.elder@...asonboard.com>,
        Michael Grzeschik <m.grzeschik@...gutronix.de>,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: uvc gadget performance issues with skip interrupt impl

Hi Greg,

On Tue, Oct 25, 2022 at 10:04:19AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Oct 25, 2022 at 01:34:01AM -0500, Jeff Vanhoof wrote:
> > Hi,
> > 
> > During the queuing up of requests from the UVC Gadget Driver to DWC3 for one
> > frame, if a missed isoc event occurs then it is possible for the next
> > consecutive frame(s) to also see missed isoc related errors as a result,
> > presenting to the user as a large video stall.
> > 
> > This issue appears to have come in with the skip interrupt implementation in
> > the UVC Gadget Driver:
> > 
> > usb: gadget: uvc: decrease the interrupt load to a quarter
> > https://lore.kernel.org/r/20210628155311.16762-6-m.grzeschik@pengutronix.de
> > 
> > Below is an example flow of how the issue can occur (and why).
> > 
> > For example (ISOC use case):
> > 1) DWC3 driver has 4 requests queued up from the UVC Gadget Driver.
> > 
> > 2) First request has IOC bit set due to no_interrupt=0 also being set, and IMI
> > bit is set to detect missed ISOC.
> > 
> > 3) Requests 2,3,4 do not have IOC bit set due to no_interrupt=1 being set for
> > them. (Note: Whether or not the IMI bit is set for these requests does not
> > matter, issue can still crop up as there is no guarantee that request 2,3,4
> > will see a missed isoc event)
> > 
> > 4) First request gets a missed isoc event and DWC3 returns the req and error to
> > UVC Gadget Driver.
> > 
> > 5) UVC Gadget Driver, in uvc_video_complete, proceeds to cancel the queue by
> > calling uvcg_queue_cancel.
> > 
> > 6) UVC Gadget Driver stops sending additional requests for the current frame.
> > 
> > 7) DWC3 will still have requests 2,3,4 queued up and sitting in its
> > started_list as these requests are not given back to the UVC gadget driver
> > because they each have no_interrupt=1 set, and the DWC3 driver will not have
> > any additional interrupts triggered for them as a result.
> > 
> > 8) Approximately 30-100ms later a new frame enters the UVC Gadget Driver (from
> > V4L2), and it proceeds to send additional requests to the DWC3 driver.
> > 
> > 9) Because requests 2,3,4 are still sitting in the started_list of the dwc3
> > driver, the driver does not stop and restart the transmission that normally
> > helps it recover from the missed isoc situation (this usually happens in
> > between frames).
> > 
> > 10) Some of the requests from the new frame will have no_interrupt=0 set, but
> > these requests will be considered missed/late by the DWC3 controller.
> > 
> > 11) Because these new requests have the IOC bit set (and possibly IMI),
> > interrupts will be triggered causing the DWC3 Driver to return the req and
> > error to the UVC Gadget Driver.
> > 
> > 12) And if the last set of requests sent by the UVC Gadget Driver have
> > "no_interrupt=1" set, then DWC3 may not interrupt further until new requests
> > come in, and the cycle of frame drops/errors will continue.
> > 
> > I have briefly mentioned this issue in another conversation with Thinh. At the
> > time he mentioned that 3 things could possibly be done to help resolve this
> > issue:
> > 
> > 1) The UVC Gadget Driver should ensure that the last requests queued to DWC3
> > always have "no_interrupt=0" set.
> > 
> > 2) DWC3 can detect stale requests, stop the transmission and give back the
> > requests to the UVC Gadget Driver, and restart the transmission for the new set
> > of requests.
> > 
> > 3) Set "no_interrupt=0" for each request.
> >  
> > I have tested out various implementations for all 3 possibilities and they each
> > seem to work ok. Note that these test implementations are not ready for prime
> > time, but served as a way to prove that potential changes in these areas could
> > help to resolve this issue.
> > 
> > I believe that a change for the UVC Gadget Driver should be made, but it also
> > makes sense for the DWC3 driver to attempt to recover from this situation
> > if possible.
> > 
> > Does anyone have an opinion on the best way to proceed?
> 
> Please see this set of patches and the discussion around them:
> 	https://lore.kernel.org/r/20221018215044.765044-1-w36195@motorola.com
> 
> Some of them are already queued up in my tree and in linux-next, can you
> try that?  There are others for the dwc3 driver on the mailing list as
> well, testing those would be wonderful if you could do that.
> 
> thanks,
> 
> greg k-h

I've been working with the submitter of those patches (Dan) to debug various
crashes and performance issues. I believe that the issue I've described above
is distinct from those, and I am not aware of any current patches targeting
it. This issue will primarily appear for users experiencing frame drops due to
missed isoc events when the skip interrupt implementation in uvc is enabled
(usb: gadget: uvc: decrease the interrupt load to a quarter).
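
As a rough illustration of steps 1-7 above, and of why options 1 and 3 help,
here is a toy Python model. It is not driver code. The names Request,
build_frame and stale_after_completion are made up for this sketch, and it
assumes that requests are only given back when the controller reaches one that
raises an interrupt, which is a simplification of what DWC3 really does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Toy stand-in for a queued request; only no_interrupt is modelled."""
    no_interrupt: bool


def build_frame(n_requests: int, interrupt_every: int = 4,
                force_last: bool = False) -> List[Request]:
    """Build the requests for one frame.

    Every `interrupt_every`-th request (starting with the first) asks for an
    interrupt, mimicking the "quarter of the interrupts" behaviour.
    With force_last=True the final request also asks for one (option 1).
    """
    reqs = []
    for i in range(n_requests):
        wants_irq = (i % interrupt_every == 0)
        if force_last and i == n_requests - 1:
            wants_irq = True
        reqs.append(Request(no_interrupt=not wants_irq))
    return reqs


def stale_after_completion(reqs: List[Request]) -> int:
    """Return how many requests are left on the toy started list.

    Assumption: each interrupting request causes it, and everything queued
    before it, to be given back. Requests behind the last interrupting
    request never generate an event and therefore stay queued.
    """
    started = list(reqs)
    while True:
        idx = next((i for i, r in enumerate(started) if not r.no_interrupt),
                   None)
        if idx is None:
            return len(started)
        started = started[idx + 1:]


if __name__ == "__main__":
    # Steps 1-7: four requests, only the first one interrupts.
    print(stale_after_completion(build_frame(4)))
    # Option 1: the last request always has no_interrupt=0.
    print(stale_after_completion(build_frame(4, force_last=True)))
    # Option 3: every request has no_interrupt=0.
    print(stale_after_completion(build_frame(4, interrupt_every=1)))
```

In this model the first case leaves requests 2,3,4 stranded, while both
option 1 and option 3 leave nothing behind. Option 2 (recovery inside DWC3) is
not modelled here.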

Thanks,
Jeff


