linux-kernel - RE: RESEND: About VBUS glitch happen on DWC3 host mode enabling process.

Message-ID: <AM5PR0402MB2865A642E09A360D7F140516F1D40@AM5PR0402MB2865.eurprd04.prod.outlook.com>
Date:   Fri, 23 Nov 2018 09:40:17 +0000
From:   Ran Wang <ran.wang_1@....com>
To:     Felipe Balbi <balbi@...nel.org>,
        Mathias Nyman <mathias.nyman@...ux.intel.com>
CC:     "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: RESEND: About VBUS glitch happen on DWC3 host mode enabling
 process.

Hello Felipe,

Felipe Balbi <balbi@...nel.org> wrote: 
> 
> Hi Ran,
> 
> Ran Wang <ran.wang_1@....com> writes:
> >> > Then, DWC3 core driver continued to call function
> >> > dwc3_host_init()->platform_device_add(xhci)…
> >> > xhci_plat_probe()->usb_add_hcd()->xhci_plat_setup()->xhci_gen_setup
> >> > -> xhci_reset(), which would reset xHCI controller. At this point,
> >> > the VBUS EN pin (USB_DRVVBUS) was pulled low for about 15us,
> >> > causing the
> >>
> >> why is that pin pulled low? XHCI reset shouldn't be a global reset.
> >> Did your HW engineer tie all reset lines together? If so, there's
> >> nothing I can do to help.
> >
> > That's the point I also want to make clear: do you mean that the VBUS
> > control signal come from DWC3 IP should not be pulled down when xHCI
> > controller conduct reset?
> > And sorry that I am not quite sure about the 'global reset' you
> > mentioned. Do you mean to a DWC3 global reset or SoC reset?
> >
> > My understanding is that since VBUS control signal only be meaningful
> > in USB host mode (xHCI), so it might be in the scope/control of xHCI
> > controller, meaning that xHCI reset trigger VBUS/USB_DRVVBUS(EN)
> > pulled low might make sense, am I right? And the information come from
> > DWC3 IP design has confirmed that PORTSC[PP] will be de-asserted
> > during HCRST, it seems this is native behavior on
> > DWC3 IP.
> 
> okay, so the thing is about PP being dropped. Right, that should happen
> indeed. However, this still shouldn't cause any problems, since peripheral
> side shouldn't connect its pull-ups until VBUS is above session valid
> threshold.
> 
> For how long is VBUS dropped in this case?

The VBUS drop lasts about 7.5ms (USB_DRVVBUS is low for about 22us).
I have to admit that only 2 brands of USB drives encountered failures; the others are
fine according to my test results. My thinking is that this glitch probably triggers a
latent defect in those USB drives on the market, which might not fully
follow the USB spec, so I would like to do something on the SW side to make the host more robust.
> 
> >> > VBUS did the same drop too, then back to normal voltage when HW
> >> > reset complete. We have confirmed this whole process according to
> >> > scope waveform with test code on DWC3 driver. Impact is that VBUS
> >> > glitch has let some USB drives (such as Transcend 4GB USB2.0
> >> > (jetflash) and Kingston 16GB USB2.0 DTGE9) malfunction during
> >> > enumeration (particularly happen when drive is connected to
> >> > root-hub port prior to Linux boot).
> >>
> >> okay
> >>
> >> > Per my understanding, VBUS need to keep +5V once enabled without
> >> > any drop/unstable. And above glitch looks like caused by the gap
> >> > between
> >> > DWC3 design and driver init procedure.
> >>
> >> why are you blaming the driver here? We don't know of any such
> >> platform that has problems with this. Do you mean to say that because
> >> your HW engineer made a choice of tying host reset to global reset,
> >> you end up having an issue? That's something else entirely that SW can't
> >> help you with.
> >
> > I didn't mean to blame driver alone, just found the time interval
> > between host mode enabling and host reset causing a observable VBUS
> > control signal glitch happen we didn't expected. And experiments
> > proved that VBUS on between host mode enabling and host reset might
> > not be necessary and can avoid this potential risk.
> >
> >> I have no idea about anything nxp has done, no access to
> >> documentation, nothing at all. I need you to do a better job at
> >> explaining the situation starting with kernel version you're using,
> >> if platform is supported upstream, etc.
> >
> > Please see my above answer.
> > These Layerscape platforms are support upstream, I can run them with
> > pure upstream build directly.
> 
> that's good, then we can debug this. Can you collect xhci tracepoints of
> when the problem happens?
 
Sorry, do you mean enabling dynamic printk support for xhci?

Actually I have debugged this for a while. The enumeration failure happens
because the USB drive reports a different USB device descriptor once it encounters
the VBUS glitch. It's interesting: it looks like it suddenly becomes another USB drive
and finally fails at SCSI protocol communication (TEST UNIT READY feedback).
I have attached a snapshot picture of the USB trace log to this mail; not sure if you can
receive it.

My judgement is that the USB drive might have multiple device configurations
stored in EPROM and accidentally reports the wrong one in some corner
case (such as encountering a VBUS glitch). And obviously the chosen
device config is not ready to behave as a Mass Storage/SCSI device. I have
checked these 2 different brands of drives and they both have the same issue
(even though the wrong device descriptors are different!), which makes me wonder
whether there are other drives on the market with the same issue.

> >> > One of probably workaround come to my mind is to program all
> >> > root-hub ports’ PORTSC[PP] to 0b immediately after enabling host
> >> > mode (calling dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_HOST)), so
> >> > VBUS
> >> > will keep 0V till xhci is reset by xhci driver like above. I have
> >> > test this and it works.
> >>
> >> dwc3 will _not_ touch xHCI registers, sorry. If you need something
> >> like that, you need to do it as a quirk in xhci-plat.c
> >
> > Thanks for pointing out a direction for me. If we do it as a quirk in
> > xhci-plat.c, how can we control it by some kind of DTS property in board
> > level config?
> 
> If, indeed, there is a quirk here, then a quirk can be passed from dwc3 to
> xhci-plat, yes.

For this I just did an experiment in xhci-plat. It can fix this issue, but the timing
seems too late: it makes the VBUS waveform look like a square wave, as below:

            Here DWC3 enables host mode, VBUS on
            |
+5V         /----------\          /------------------....
                90ms       40ms
0V  ________/          \__________/
                       |          |
                       |          Here do xhci reset, VBUS back to +5V again
                       Here set all PORTSC[PP] to 0
So I am afraid the solution might have to be added in the DWC3 core driver,
just after the host mode enabling code, if we want to fix this :(
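
To make the idea concrete, here is a minimal sketch of the "set all PORTSC[PP] to 0"
step. It is not a tested patch: the helper names are hypothetical, the registers are
accessed through a plain pointer instead of readl()/writel(), and it assumes the
caller passes a pointer to the first PORTSC register plus the number of root-hub
ports. The bit positions and the 0x10-byte port register stride are taken from the
xHCI spec. The RW1C/RW1S bits are masked out before writing back so that clearing
PP does not also clear change bits or disable/reset the port by accident.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PORTSC bit positions per the xHCI specification. */
#define PORTSC_PED          (1u << 1)      /* Port Enabled/Disabled, RW1C */
#define PORTSC_PR           (1u << 4)      /* Port Reset, RW1S */
#define PORTSC_PP           (1u << 9)      /* Port Power */
#define PORTSC_CHANGE_BITS  (0x7fu << 17)  /* CSC..CEC (bits 17-23), RW1C */

/* Each port register set is 0x10 bytes, i.e. four 32-bit words. */
#define PORT_REGSET_WORDS   4u

/* Hypothetical helper: value to write back so that only PP is dropped. */
static uint32_t portsc_power_off_value(uint32_t portsc)
{
    /* Never write 1 to RW1C/RW1S bits we do not intend to trigger. */
    portsc &= ~(PORTSC_PED | PORTSC_PR | PORTSC_CHANGE_BITS);
    /* Drop port power so VBUS stays off until the xHCI driver takes over. */
    portsc &= ~PORTSC_PP;
    return portsc;
}

/*
 * Hypothetical helper: clear PP on every root-hub port.
 * first_portsc points at PORTSC of port 1 (assumed to be mapped already).
 */
static void clear_all_port_power(volatile uint32_t *first_portsc,
                                 unsigned int num_ports)
{
    assert(first_portsc != NULL);

    for (unsigned int i = 0; i < num_ports; i++) {
        volatile uint32_t *portsc = first_portsc + (size_t)i * PORT_REGSET_WORDS;

        *portsc = portsc_power_off_value(*portsc);
    }
}
```

In a real driver this would be called right after host mode is enabled, with the
port count read from the controller instead of being passed in by hand.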

Regards,
Ran
> 
> ps: Mathias, did you see any behavior like this? A drop in VBUS voltage
> causing issues during enumeration?
> 
> --
> balbi

Download attachment "fail.png" of type "image/png" (121545 bytes)