linux-kernel - Re: [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame interval length and buffersize

Message-ID: <dcad0089-4105-44bc-a2b4-3cfc6f44164b@google.com>
Date: Mon, 22 Apr 2024 17:21:09 -0700
From: Avichal Rakesh <arakesh@...gle.com>
To: Michael Grzeschik <mgr@...gutronix.de>,
 Laurent Pinchart <laurent.pinchart@...asonboard.com>,
 Daniel Scally <dan.scally@...asonboard.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Jayant Chowdhary <jchowdhary@...gle.com>, etalvala@...gle.com
Cc: linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame
 interval length and buffersize

On 4/21/24 16:25, Michael Grzeschik wrote:
> On Tue, Apr 09, 2024 at 11:24:56PM +0200, Michael Grzeschik wrote:
>> This patch series is improving the size calculation and allocation
>> of the uvc requests. Using the currently set up frame duration of the
>> stream it is possible to calculate the number of requests based on the
>> interval length.
> 
> The basic concept here is right. But unfortunately we found out that
> together with Patch [1] and the current zero length request pump
> mechanism [2] and [3] this is not working as expected.
> 
> The conclusion that we can not queue more than one frame at once into
> the hw led to [1]. The current implementation of zero-length requests,
> which will be queued while we are waiting for the frame to finish
> transferring, will enlarge the frame duration, since every zero-length
> request still takes up at least one frame interval of 125 us.

I haven't taken a super close look at your patches, so please feel free
to correct me if I am misunderstanding something.

It looks like the goal of the patches is to determine a better number
and size of usb_requests from the given framerate such that we send exactly
nreqs requests per frame where nreqs is determined to be the exact number 
of requests that can be sent in one frame interval?

As the logic stands, we need some 0-length requests to be circulating to
ensure that we don't miss ISOC deadlines. The current logic unconditionally
sends half of all allocated requests to be circulated.

With those two things in mind, this means that video_pump can encode
at most half a frame in one go, and then has to wait for complete
callbacks to come in. In such cases, the theoretical worst case for
encode time is
    125us * (number of requests needed per frame / 2) + scheduling delays
as after the first half of the frame has been encoded, the video_pump
thread will have to wait 125us for each of the zero-length requests to
be returned.
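
To put a rough number on that estimate, here is a minimal userspace
sketch of the arithmetic. It is not code from the driver: the helper
names, the 614400-byte frame and the 3072-byte request payload are all
made up for illustration, and scheduling delays are left out.

```c
#include <assert.h>
#include <stdint.h>

/* One ISOC service interval, as discussed above. */
#define UVC_SKETCH_INTERVAL_US 125u

/* Hypothetical helper: requests needed to carry one encoded frame. */
static inline uint32_t sketch_reqs_per_frame(uint32_t frame_bytes,
					     uint32_t req_payload_bytes)
{
	assert(req_payload_bytes > 0);
	/* Round up so a partially filled last request is counted. */
	return (frame_bytes + req_payload_bytes - 1) / req_payload_bytes;
}

/*
 * Hypothetical helper: worst-case time video_pump spends waiting when
 * half of the requests are circulating as zero-length requests and each
 * of them comes back only after one 125us interval. Scheduling delays
 * are NOT included.
 */
static inline uint64_t sketch_worst_case_wait_us(uint32_t nreqs)
{
	return (uint64_t)UVC_SKETCH_INTERVAL_US * (nreqs / 2);
}
```

With those assumed numbers, a frame needs 200 requests and the wait
term comes out at 125us * 100 = 12.5ms before any scheduling delay.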

The underlying assumptions behind the "queue 0-length requests" approach
were that video_pump encodes the frames in as few requests as possible
and that there are spare requests to maintain pressure on the
ISOC queue without hindering the video_pump thread, and unfortunately
it seems like patch 3/3 is breaking both of those assumptions?

Assuming my understanding of your patches is correct, my question
is: Why do we want to spread the frame uniformly over the requests
instead of encoding it in as few requests as possible? Spreading
the frame over more requests artificially increases the encode time
required by video_pump, and AFAICT there is no real benefit to it?
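
For illustration only, this is how I picture the two strategies being
compared. It is a sketch under my reading of the series, not code from
the patches: the function names are hypothetical, and the 30fps
interval, frame size and maximum payload are assumed values.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INTERVAL_US 125u

/* "As few requests as possible": fill every request to its maximum. */
static inline uint32_t sketch_nreqs_packed(uint32_t frame_bytes,
					   uint32_t max_payload_bytes)
{
	assert(max_payload_bytes > 0);
	return (frame_bytes + max_payload_bytes - 1) / max_payload_bytes;
}

/*
 * "Spread uniformly": one request per 125us slot that fits into the
 * frame interval, regardless of how small each payload ends up being.
 */
static inline uint32_t sketch_nreqs_spread(uint32_t frame_interval_us)
{
	return frame_interval_us / SKETCH_INTERVAL_US;
}
```

With a 614400-byte frame, a 3072-byte maximum payload and a 33333us
frame interval (all assumed), packing needs 200 requests while
spreading uses 266, so video_pump has more requests to fill for the
same frame.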

> Therefore to properly make those patches work, we will have to get rid of
> the zero length pump mechanism again and make sure that the whole
> business logic of what to be queued and when will only be done in the
> pump worker. It is possible to let the dwc3 udc run dry, as we are
> actively waiting for the frame to finish, the last request in the
> prepared and started list will stop the current dwc3 stream and therefore
> no underruns will occur with the next ep_queue.

One thing to note here: The reason we moved to queuing 0-length requests
from the complete callback was because even with realtime priority, the
video_pump thread doesn't always meet the ISOC queueing cadence. I think
stopping and starting the stream was briefly discussed in our initial thread at
https://lore.kernel.org/all/20230419001143.pdxflhzyecf4kvee@synopsys.com/
and Thinh mentioned that the dwc3 controller does it if it detects an underrun,
but I am not sure if starting and stopping an ISOC stream is good practice.
Someone better versed in the USB protocol can probably confirm, but it seems
somewhat hacky to stop the ISOC stream at the end of the frame and restart
with the next frame.

> With all these pending patches the whole uvc saga of underruns and
> flickering videostreams should come to an end™.

This would indeed be nice!

> 
> I already started with this but would be happy to see Avichal and others
> review the patches when they are ready in my eyes.

Of course!

- Avi.