linux-kernel - Re: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame

Open Source and information security mailing list archives

Message-ID: <56B1B093.1020300@rock-chips.com>
Date:	Wed, 03 Feb 2016 15:47:31 +0800
From:	Kever Yang <kever.yang@...k-chips.com>
To:	Doug Anderson <dianders@...omium.org>
CC:	Tao Huang <huangtao@...k-chips.com>,
	Stefan Wahren <stefan.wahren@...e.com>,
	Heiko Stübner <heiko@...ech.de>,
	John Youn <johnyoun@...opsys.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Ming Lei <ming.lei@...onical.com>,
	"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Felipe Balbi <balbi@...com>,
	John Youn <John.Youn@...opsys.com>,
	"open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>,
	"Kaukab, Yousaf" <yousaf.kaukab@...el.com>,
	Alan Stern <stern@...land.harvard.edu>,
	linux-rpi-kernel@...ts.infradead.org,
	Julius Werner <jwerner@...omium.org>,
	吴良峰 <william.wu@...k-chips.com>,
	"Herrero, Gregory" <gregory.herrero@...el.com>,
	Dinh Nguyen <dinguyen@...nsource.altera.com>
Subject: Re: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame

Doug,

Thanks for your detail debug information, pls add my Reviewed-by for 
this patch.

Thanks,
- Kever
On 02/03/2016 06:47 AM, Doug Anderson wrote:
> Kever,
>
> On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang@...k-chips.com> wrote:
>> Doug,
>>
>>
>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>> When setting up ISO and INT transfers dwc2 needs to specify whether the
>>> transfer is for an even or an odd frame (or microframe if the controller
>>> is running in high speed mode).
>>>
>>> The controller appears to use this as a simple way to figure out if a
>>> transfer should happen right away (in the current microframe) or should
>>> happen at the start of the next microframe.  Said another way:
>>>
>>> - If you set "odd" and the current frame number is odd it appears that
>>>     the controller will try to transfer right away.  Same thing if you set
>>>     "even" and the current frame number is even.
>>> - If the oddness you set and the oddness of the frame number are
>>>     _different_, the transfer will be delayed until the frame number
>>>     changes.
>>>
>>> As I understand it, the above technique allows you to plan ahead of time
>>> where possible by always working on the next frame.  ...but it still
>>> allows you to properly respond immediately to things that happened in
>>> the previous frame.
>>>
>>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
>>> It always looked at the frame number and setup the transfer to happen in
>>> the next frame.  In some cases that meant that certain transactions
>>> would be transferred in the wrong frame.
>>>
>>> We'll try our best to set the even / odd to do the transfer in the
>>> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
>>> We'll also modify the scheduler code to handle this and not try to
>>> schedule a second transfer for the same frame.
>>>
>>> Note that this change relies on the work to redo the microframe
>>> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
>>> in scheduler") but it works even better after ("usb: dwc2: host: Totally
>>> redo the microframe scheduler").
>>>
>>> With this change my stressful USB test (USB webcam + USB audio +
>>> keyboards) has less audio crackling than before.
>> Seems this really help for your case?
> Yes, I believe it does.  Of course my test case is pretty "black box"
> for the most part in that I play music on youtube while having a
> webcam open and several USB input devices connected.  I then try to
> decide whether I hear more static or less static.  ...clearly a less
> subjective test would be better...
>
> * I tried with http://crosreview.com/325451 (see below) and I hear
> more static with "use_old = true" than with "use_old = "false".
>
> * I tried with this entire patch reverted and I hear about the same
> static as with "use_old = true".
>
> Note that counting reported MISS lines from my logging also shows that
> the new code is better...
>
>
>> Do you check if the transfer can happen right in the current frame? I know
>> it's
>> quite difficult to check it, but this changes what I know for the dwc core
>> schedule the transaction.
> Yes.  I just tried again, too.  I coded up
> <https://chromium-review.googlesource.com/325451> and included it.  I
> then opened up a USB webcam.
>
> With things set to the old way:
>
>    115.355370  QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0
>    115.355373  QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb
>    115.355518  QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0
>    115.355522  QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc
>    115.355637  QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0
>    115.355641  QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd
>    115.355857  QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0
>    115.355859  QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce
>    115.355867  QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD)
>    115.355870  QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf)
>    115.356037  QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS
>    115.356039  QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0
>    115.356169  QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0
>    115.356170  QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1
>    115.356269  QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0
>    115.356273  QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2
>    115.356404  QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0
>    115.356407  QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3
>
> With the new way:
>
>     87.814741  QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0
>     87.814744  QH=e2fd7880 IMM ready fn=32e4, nxt=32e4
>     87.814858  QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0
>     87.814862  QH=e2fd7880 IMM ready fn=32e5, nxt=32e5
>     87.815010  QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>     87.815391  QH=e2fd7880 IMM ready fn=32e9, nxt=32e9
>     87.815491  QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0
>     87.815493  QH=e2fd7880 IMM ready fn=32ea, nxt=32ea
>     87.815635  QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0
>     87.815638  QH=e2fd7880 IMM ready fn=32eb, nxt=32eb
>
>
> Note that with my TEST-ONLY patch the old way is still _slightly_
> different in that I still communicate back to the scheduler with:
>
>    chan->qh->next_active_frame = now_frame;
>
> The old code didn't used to do that.  If I don't do that then you
> you'll just stay in an inconsistent state for a while where things are
> going on the wire 1 frame later than we think they are.
>
>
> Also note that above you can see that the new way is indeed able to
> schedule things in the current microframe.  Looking one line at a
> time:
>
>
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>
> QH e2fd7880 is going straight to the ready queue.  Actual frame number
> in hardware is 32e6.  next_active_frame = 32e6 which means we ideally
> want to give it to hardware in 32e6 and wire frame is 32e7.
>
>
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>
> Frame number in hardware is now 32e8.  We'd like to give the next
> transfer to hardware in 32e7 to transfer on the wire at 32e8, but
> that's obviously impossible.  We will try to give it right away.
>
>
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>
> Showing a difference in the old way.  We'll choose "even" to have the
> packet go on the wire (expecting 32e8).
>
>
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>
> We got a response back and are ready to schedule the next transfer and
> it's still 32e8!  That means that transfer must have happened (as
> expected) in 32e8.  Whew!  Give the next transfer to hardware hoping
> for 32e9 wire.
>
>
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>
> Now at hardware 32e9 and ready to schedule the next...
>
>
>
>> In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso
>> IN/OUT)
>> in DMA Mode, the normal Interrupt OUT operation says:
>> The DWC_otg host attempts to send out the OUT token in the beginning of next
>> odd frame/microframe.
>>
>> So I'm confuse about if the dwc core can do the transaction at the same
>> frame
>> of host channel initialized or not.
> The docbook is obviously way too terse here, but the above experiment
> shows that the hardware is designed in the only sane way that it could
> be designed.
>
> Why do I say that this is the only sane way for the hardware to work?
> I think all the following is true (please correct any errors):
>
> A) HW only lets you specify even/odd which means you choose between
> two frame to send the packet.  Two possible ways HW could be
> implemented: "sane" way means you can send a packet in frame "x" and
> "x + 1".  "insane" way means you can send a packet in frame "x + 1"
> and "x + 2" but not frame "x"
>
> B) In some cases (especially with regards to SPLIT transfers), we need
> to use the result of a transfer in uFrame "x" to decide what to do
> about uFrame "x + 1".  Specifically for IN transfers I think we can't
> know for sure whether we'll get back all of our data in uFrame "x" or
> whether we'll only get part of the data and need uFrame "x + 1".
>
> C) It's possible to schedule 100us worth of periodic transfers in one
> 125us uFrame.
>
> D) We can't know the result of a transfer until that transfer is done.
>
>
> So above basically means that we might have a periodic transfer where
> we get the result of the transfer 100us into a uFrame.  We've now got
> to quickly queue up the transfer for the next uFrame.  If hardware was
> designed in the "insane" way then we'd need an interrupt latency of <
> 25 us since once the frame ticked we'd no longer be able to schedule.
> If hardware was designed in the "sane" way then we'd "only" need an
> interrupt latency of 125 us since we could continue to schedule even
> partway through the current frame.
>
> Also note that if there's any chance that a periodic transfer ends
> later than 100 us into a frame (like if a non-periodic transfer snuck
> in there because we were out of periodic channels) then the above
> problem becomes even more extreme.
>
>
>
> -Doug
>
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@...ts.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
>

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives