[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CABTNMG3P3UNm-a=1iM_RbC9GXqKZGUkVWg=7ZyjhJV4C3Zarmg@mail.gmail.com>
Date: Thu, 15 Apr 2021 01:05:57 +0800
From: Chris Chiu <chris.chiu@...onical.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: gregkh@...uxfoundation.org, m.v.b@...box.com, hadess@...ess.net,
linux-usb@...r.kernel.org,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] USB: Don't set USB_PORT_FEAT_SUSPEND on WD19's Realtek Hub
On Wed, Apr 14, 2021 at 10:32 PM Alan Stern <stern@...land.harvard.edu> wrote:
>
> On Wed, Apr 14, 2021 at 01:07:43PM +0800, Chris Chiu wrote:
> > Thanks for the instructions. I can hit the same timeout problem with
> > runtime PM. The
> > fail rate seems the same as normal PM. (around 1/4 ~ 1/7)
> > root@:/sys/bus/usb/devices/3-4.3# echo auto > power/control
> > root@:/sys/bus/usb/devices/3-4.3# echo on > power/control
> > root@:/sys/bus/usb/devices/3-4.3# dmesg -c
> > [ 2789.679807] usb 3-4: kworker/7:0 timed out on ep0out len=0/0
> > [ 2789.679812] usb 3-4-port3: can't suspend, status -110
> > [ 2789.680078] usb 3-4.3: Failed to suspend device, error -110
>
> Since these are random failures, occurring at a low rate, maybe it would
> help simply to retry the transfer that timed out. Have you tested this?
>
The problem is the port seems to be dead (at least unresponsive) after
USB_PORT_FEAT_SUSPEND. If I turned on the xhci_hcd debug message,
I'll see lots of retries of the control URB as follows which never get acked
in failure cases.
[ 126.616105] xhci_hcd 0000:04:00.3: ep 0x81 - asked for 2 bytes, 1
bytes untransferred
I tried to increase the timeout from 1 second to 2 second and also tried
set USB_PORT_FEAT_SUSPEND again after timeout, but still get timeout.
That also explains why hub_ext_port_status returns -71 (EPROTO from xhci)
in hub_activate() during resuming since the port is in an unknown state.
> > > > > > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > > > > > index 7f71218cc1e5..8478d49bba77 100644
> > > > > > --- a/drivers/usb/core/hub.c
> > > > > > +++ b/drivers/usb/core/hub.c
> > > > > > @@ -3329,8 +3329,11 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
> > > > > > * descendants is enabled for remote wakeup.
> > > > > > */
> > > > > > else if (PMSG_IS_AUTO(msg) || usb_wakeup_enabled_descendants(udev) > 0)
> > > > > > - status = set_port_feature(hub->hdev, port1,
> > > > > > - USB_PORT_FEAT_SUSPEND);
> > > > > > + if (udev->quirks & USB_QUIRK_NO_SET_FEAT_SUSPEND)
> > > > >
> > > > > You should test hub->hdev->quirks, here, not udev->quirks. The quirk
> > > > > belongs to the Realtek hub, not to the device that's plugged into the
> > > > > hub.
> > > > >
> > > >
> > > > Thanks for pointing that out. I'll verify again and propose a V2 after
> > > > it's done.
> > >
> > > Another thing to consider: You shouldn't return 0 from usb_port_suspend
> > > if the port wasn't actually suspended. We don't want to kernel to have
> > > a false idea of the hardware's current state.
> > >
> > So we still need the "really_suspend=false". What if I replace it with
> > the following?
> > It's a little verbose but expressive enough. Any suggestions?
> >
> > + else if (!(hub->hdev->quirks & USB_QUIRK_NO_SET_FEAT_SUSPEND) &&
> > + (PMSG_IS_AUTO(msg) || usb_wakeup_enabled_descendants(udev) > 0))
> > + status = set_port_feature(hub->hdev, port1,
> > + USB_PORT_FEAT_SUSPEND);
> > else {
> > really_suspend = false;
> > status = 0;
>
> You should do something more like this:
>
> - else if (PMSG_IS_AUTO(msg) || usb_wakeup_enabled_descendants(udev) > 0)
> - status = set_port_feature(hub->hdev, port1,
> - USB_PORT_FEAT_SUSPEND);
> - else {
> + else if (PMSG_IS_AUTO(msg) || usb_wakeup_enabled_descendants(udev) > 0) {
> + if (hub->hdev->quirks & USB_QUIRK_NO_SUSPEND)
> + status = -EIO;
> + else
> + status = set_port_feature(hub->hdev, port1,
> + USB_PORT_FEAT_SUSPEND);
> + } else {
> really_suspend = false;
> status = 0;
> }
>
> But I would prefer to find a way to make port suspend actually work,
> instead of giving up on it.
>
> Alan Stern
I tried to do that and also dig into the xhci code to check why the TD
(Transfer Descriptor) of the corresponding control msg URB was not
completed. Unfortunately, I didn't find a reasonable answer. I'll verify
the status -EIO and propose a v3 if everything's fine. Please let me
know if there's anything worth trying. Thanks.
Chris
Powered by blists - more mailing lists