Date:   Thu, 19 Jan 2017 01:33:52 +0000
From:   John Youn <John.Youn@...opsys.com>
To:     Felipe Balbi <balbi@...nel.org>,
        John Youn <John.Youn@...opsys.com>,
        Baolin Wang <baolin.wang@...aro.org>
CC:     Greg KH <gregkh@...uxfoundation.org>,
        Mark Brown <broonie@...nel.org>,
        USB <linux-usb@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] usb: dwc3: core: Disable USB2.0 phy suspend when dwc3
 acts as host role

On 1/16/2017 2:38 AM, Felipe Balbi wrote:
>
> Hi,
>
> John Youn <John.Youn@...opsys.com> writes:
>>> Baolin Wang <baolin.wang@...aro.org> writes:
>>>>> Baolin Wang <baolin.wang@...aro.org> writes:
>>>>>> When the dwc3 controller acts as host with a slow-speed device attached
>>>>>> (such as a mouse or keypad), unplugging that device causes the
>>>>>> deconfiguration endpoint command, which drops the endpoint's resources,
>>>>>> to time out. The xHCI log below shows the command timeout when
>>>>>> disconnecting one slow device:
>>>>>>
>>>>>> [   99.807739] c0 xhci-hcd.0.auto: Port Status Change Event for port 1
>>>>>> [   99.814699] c0 xhci-hcd.0.auto: resume root hub
>>>>>> [   99.819992] c0 xhci-hcd.0.auto: handle_port_status: starting port
>>>>>>                                  polling.
>>>>>> [   99.827808] c0 xhci-hcd.0.auto: get port status, actual port 0 status
>>>>>>                                  = 0x202a0
>>>>>> [   99.835903] c0 xhci-hcd.0.auto: Get port status returned 0x10100
>>>>>> [   99.850052] c0 xhci-hcd.0.auto: clear port connect change, actual
>>>>>>                                  port 0 status  = 0x2a0
>>>>>> [   99.859313] c0 xhci-hcd.0.auto: Cancel URB ffffffc01ed6cd00, dev 1,
>>>>>>                                  ep 0x81, starting at offset 0xc406d210
>>>>>> [   99.869645] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>> [   99.874776] c0 xhci-hcd.0.auto: Stopped on Transfer TRB
>>>>>> [   99.880713] c0 xhci-hcd.0.auto: Removing canceled TD starting at
>>>>>>                                  0xc406d210 (dma).
>>>>>> [   99.889012] c0 xhci-hcd.0.auto: Finding endpoint context
>>>>>> [   99.895069] c0 xhci-hcd.0.auto: Cycle state = 0x1
>>>>>> [   99.900519] c0 xhci-hcd.0.auto: New dequeue segment =
>>>>>>                                  ffffffc1112f0880 (virtual)
>>>>>> [   99.908655] c0 xhci-hcd.0.auto: New dequeue pointer = 0xc406d220 (DMA)
>>>>>> [   99.915927] c0 xhci-hcd.0.auto: Set TR Deq Ptr cmd, new deq seg =
>>>>>>                                  ffffffc1112f0880 (0xc406d000 dma),
>>>>>>                                  new deq ptr = ffffff8002175220
>>>>>>                                  (0xc406d220 dma), new cycle = 1
>>>>>> [   99.931242] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>> [   99.936360] c0 xhci-hcd.0.auto: Successful Set TR Deq Ptr cmd,
>>>>>>                                  deq = @c406d220
>>>>>> [   99.944458] c0 xhci-hcd.0.auto: xhci_hub_status_data: stopping port
>>>>>>                                  polling.
>>>>>> [  100.047619] c0 xhci-hcd.0.auto: xhci_drop_endpoint called for udev
>>>>>>                                  ffffffc01ae08800
>>>>>> [  100.057002] c0 xhci-hcd.0.auto: drop ep 0x81, slot id 1, new drop
>>>>>>                                  flags = 0x8, new add flags = 0x0
>>>>>> [  100.067878] c0 xhci-hcd.0.auto: xhci_check_bandwidth called for udev
>>>>>>                                  ffffffc01ae08800
>>>>>> [  100.076868] c0 xhci-hcd.0.auto: New Input Control Context:
>>>>>>
>>>>>> ......
>>>>>>
>>>>>> [  100.427252] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>> [  105.430728] c0 xhci-hcd.0.auto: Command timeout
>>>>>> [  105.436029] c0 xhci-hcd.0.auto: Abort command ring
>>>>>> [  113.558223] c0 xhci-hcd.0.auto: Command completion event does not match
>>>>>>                                  command
>>>>>> [  113.569778] c0 xhci-hcd.0.auto: Timeout while waiting for configure
>>>>>>                                  endpoint command
>>>>>>
>>>>>> The reason is that the USB phy is suspended, which disables the phy
>>>>>> clock, when the slow-speed USB device is disconnected. xHCI commands
>>>>>> that depend on the phy clock then hang.
>>>>>>
>>>>>> Thus we should disable the USB 2.0 phy suspend feature when dwc3 acts
>>>>>> as host.
>>>>>>
>>>>>> Signed-off-by: Baolin Wang <baolin.wang@...aro.org>
>>>>>> ---
>>>>>>  drivers/usb/dwc3/core.c |   14 ++++++++++++++
>>>>>>  1 file changed, 14 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>>>>> index 9a4a5e4..0b646cf 100644
>>>>>> --- a/drivers/usb/dwc3/core.c
>>>>>> +++ b/drivers/usb/dwc3/core.c
>>>>>> @@ -565,6 +565,20 @@ static int dwc3_phy_setup(struct dwc3 *dwc)
>>>>>>       if (dwc->revision > DWC3_REVISION_194A)
>>>>>>               reg |= DWC3_GUSB2PHYCFG_SUSPHY;
>>>>>>
>>>>>> +     /*
>>>>>> +      * When dwc3 controller acts as host role with attaching one slow speed
>>>>>> +      * device (like mouse or keypad). Then if we plugged out the slow speed
>>>>>> +      * device, it will timeout to run the deconfiguration endpoint command.
>>>>>> +      * The reason is it will suspend USB phy to disable phy clock when
>>>>>> +      * disconnecting slow speed device, which will affect the xHCI commands
>>>>>> +      * executing.
>>>>>> +      *
>>>>>> +      * Thus we should disable USB 2.0 phy suspend feature when dwc3 acts as
>>>>>> +      * host role.
>>>>>> +      */
>>>>>> +     if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->dr_mode == USB_DR_MODE_OTG)
>>>>>> +             reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
>>>>>
>>>>> which version of the core you're using? Recent version (since 1.94A,
>>>>
>>>> My version is 2.80a.
>>>>
>>>>> IIRC) can manage core suspend automatically. Also, this patch of yours
>>>>> will cause a power consumption regression.
>>>>
>>>> Yes, it can manage core suspend automatically, and that is the problem.
>>>> When a mouse or keypad is unplugged, the phy suspends automatically
>>>> and disables the phy clock. But the disconnect process is not finished
>>>> at that point, and some xHCI commands (like the deconfiguration
>>>> endpoint command that drops endpoint resources) depend on the phy
>>>> clock. They hang until the command times out or the command ring is
>>>> aborted, which halts the xHCI.
>>>>
>>>> I agree that it will cause a power consumption regression, but leaving
>>>> phy suspend enabled causes a serious problem. Do you have any suggestions?
>>>
>>> sorry for the long delay. This was lost in my inbox.
>>>
>>> I'm not sure this patch is the best solution. There's no mention in
>>> Databook that we should avoid PHY suspend when acting as host. Adding
>>> John here to see if John has any idea of how to fix this.
>>>
>>
>> I'm not familiar enough with XHCI side of things to say.
>>
>> I'll ask around to see if anyone has an idea.
>

Hi Felipe, Baolin,

I talked with a couple of engineers here, and this behavior is not
expected in host mode.

Can you check that GCTL.RAMCLKSEL is set appropriately? This field
selects where the core gets its clock signal from. If the core is
getting it from the phy clock, you will likely have this problem and
will need to adjust the setting. Otherwise you should probably use the
existing quirk instead.
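
For reference, a minimal sketch of decoding the RAMCLKSEL field from a raw
GCTL value. This is a standalone userspace helper, not driver code. The
field position (bits 7:6) and the encodings (0 = bus clock, 1 = pipe clock,
2 = pipe/2 clock) are assumptions based on the DWC3_GCTL_RAMCLKSEL and
DWC3_GCTL_CLK_* definitions in drivers/usb/dwc3/core.h as I recall them;
please verify them against your tree and the Databook. The helper names are
made up for illustration.

```c
#include <stdint.h>

/* Assumed layout: RAMCLKSEL occupies GCTL bits 7:6 (verify for your core). */
#define GCTL_RAMCLKSEL_SHIFT 6
#define GCTL_RAMCLKSEL_MASK  0x3u

/* Assumed encodings, mirroring DWC3_GCTL_CLK_BUS/PIPE/PIPEHALF. */
enum ramclksel {
	RAMCLK_BUS      = 0,	/* bus clock */
	RAMCLK_PIPE     = 1,	/* pipe clock */
	RAMCLK_PIPEHALF = 2,	/* pipe clock divided by two */
};

/* Extract the two-bit RAMCLKSEL field from a raw GCTL register value. */
static inline unsigned int gctl_ramclksel(uint32_t gctl)
{
	return (gctl >> GCTL_RAMCLKSEL_SHIFT) & GCTL_RAMCLKSEL_MASK;
}

/* Human-readable name for a RAMCLKSEL encoding. */
static inline const char *ramclksel_name(unsigned int sel)
{
	switch (sel) {
	case RAMCLK_BUS:
		return "bus clock";
	case RAMCLK_PIPE:
		return "pipe clock";
	case RAMCLK_PIPEHALF:
		return "pipe/2 clock";
	default:
		return "reserved";
	}
}

/*
 * Returns 1 when the selected RAM clock is anything other than the bus
 * clock, i.e. when it may stop if the clock source goes away.
 */
static inline int ramclk_not_bus(uint32_t gctl)
{
	return gctl_ramclksel(gctl) != RAMCLK_BUS;
}
```

Given a GCTL value read from the hardware, ramclksel_name(gctl_ramclksel(gctl))
reports the selected source, and ramclk_not_bus(gctl) flags the case worth a
closer look.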

Regards,
John
