lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPBYUsCEjMSJ8P8ZM1_W+S1DOWFTOM0wJwi2fTukfxSGucYhnQ@mail.gmail.com>
Date:   Mon, 5 Sep 2022 16:36:16 +0800
From:   Ray Chi <raychi@...gle.com>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        mathias.nyman@...ux.intel.com,
        Albert Wang <albertccwang@...gle.com>,
        Badhri Jagan Sridharan <badhri@...gle.com>,
        Puma Hsu <pumahsu@...gle.com>, linux-usb@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [Patch v2] usb: core: stop USB enumeration if too many retries

On Sat, Sep 3, 2022 at 1:07 AM Alan Stern <stern@...land.harvard.edu> wrote:
>
> On Sat, Sep 03, 2022 at 12:08:04AM +0800, Ray Chi wrote:
> > On Fri, Sep 2, 2022 at 10:49 PM Alan Stern <stern@...land.harvard.edu> wrote:
> > >
> > > On Fri, Sep 02, 2022 at 05:15:35PM +0800, Ray Chi wrote:
> > > > If a broken accessory connected to a USB host, usbcore might
> > > > keep doing enumeration retries and it will take a long time to
> > > > cause system unstable.
> > > >
> > > > This patch provides a quirk to specific USB ports of the hub to
> > > > stop USB enumeration if needed.
> > >
> > > This seems very awkward.  Why not have a quirk that prevents USB
> > > enumeration completely, instead of after some number of retries?  After
> > > all, if the port is connected to a broken accessory, there's no reason
> > > to try enumerating it even once.
> > >
> > > For that matter, have you tried using the existing "disabled" port
> > > attribute instead of adding a new quirk?  Does it already solve your
> > > problem?
> > >
> >
> > Since we don't know if the connected accessory is normal or broken, doing port
> > initialization is necessary.
>
> I don't understand.  If you don't know whether the accessory is broken,
> how do you know whether to set the quirk?
>
> On the other hand, if you always set the quirk even before you know
> whether the accessory is broken, why make it a quirk at all?  Why not
> make it the normal behavior of the driver?
>

Since our device has a watchdog mechanism, when the device connects to
a broken accessory, the kernel panic will happen. This problem didn't happen
in all USB Hosts, so I want to use the quirk to fix this problem for those hosts
with a watchdog mechanism.

> > > > +              * Some USB hosts can't take a long time to keep doing enumeration
> > > > +              * retry. After doing half of the retries, we would turn off the port
> > > > +              * power to stop enumeration if the quirk is set.
> > >
> > > What made you decide that half of the retries was the right place to
> > > stop?  Why not do all the retries?
> >
> > Since some normal devices will be timeout in the first attempt, I set
> > the condition to half
> > of the retries. All the retries will take 12*timeout seconds. It is
> > too long so that a watchdog
> > timeout problem may happen.
>
> Why not set CONFIG_USB_FEW_INIT_RETRIES instead?
>

https://source.android.com/docs/core/architecture/kernel/android-common
According to Android Common Kernel, I can't only add this config to one project.
In addition, it can't stop enumeration so that the timeout problem
still happens.

> > > If the quirk prevented enumeration completely then this function
> > > wouldn't be needed.
> >
> > The enumeration is still needed as above.
> >
> > >
> > > > +
> > > >  /* Check if a port is power on */
> > > >  int usb_port_is_power_on(struct usb_hub *hub, unsigned int portstatus)
> > > >  {
> > > > @@ -4855,6 +4879,11 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
> > > >                                       buf->bMaxPacketSize0;
> > > >                       kfree(buf);
> > > >
> > > > +                     if (r < 0 && (port_dev->quirks & USB_PORT_QUIRK_STOP_ENUM)) {
> > >
> > > How come this line tests the quirk but doesn't call
> > > hub_port_stop_enumerate()?
> >
> > Since the quirk is used to stop enumeration and reduce the total time.
> > If the port has the quirk, I think the port doesn't need to do
> > set_address after the port gets
> > failures in the new scheme. It will add 2 attempts * timeout (defined
> > in hc_driver) seconds.
>
> I still can't tell what you're trying to accomplish.  You need to do a
> much better job of explaining the point of this.  For instance, you
> might describe in detail a situation where the quirk is needed,
> explaining what sort of behavior of the system would lead you to set the
> quirk, and why.
>

There is a kernel panic when the device connects to the broken accessory.
I tried to modify the initial_descriptor_timeout. When the accessory is not
working, the total time is 6.5s (get descriptor retry) + 5*2 seconds
(set address of xhci timeout).
The time is so long to cause kernel panic for the device. This is why I want to
stop enumeration instead reducing the retries or timeout.

[16433.648337] usb 2-1: reset full-speed USB device number 2 using
xhci-hcd-exynos
[16435.311614] usb 2-1: device descriptor read/64, error -110
[16437.103767] usb 2-1: device descriptor read/64, error -110
[16437.339768] usb 2-1: reset full-speed USB device number 2 using
xhci-hcd-exynos
[16439.023868] usb 2-1: device descriptor read/64, error -110
[16440.815597] usb 2-1: device descriptor read/64, error -110
[16441.051575] usb 2-1: reset full-speed USB device number 2 using
xhci-hcd-exynos
[16446.063656] xhci-hcd-exynos xhci-hcd-exynos.4.auto: Timeout while
waiting for setup device command
[16446.953961] Kernel panic - not syncing: PM suspend timeout
[16446.978122] Workqueue: events_unbound async_run_entry_fn.cfi_jt
[16446.978136] Call trace:
[16446.978150]  __switch_to+0x260/0x4dc
[16446.978165]  __schedule+0x6c4/0xabc
[16446.978181]  schedule+0x12c/0x24c
[16446.978195]  schedule_timeout+0x48/0x138
[16446.978210]  wait_for_common+0x148/0x310
[16446.978238]  xhci_setup_device+0x470/0xe30
[16446.978250]  xhci_address_device+0x18/0x28
[16446.978340]  hub_port_init+0x5a0/0xfec
[16446.978356]  usb_reset_and_verify_device+0x710/0xb8c
[16446.978370]  usb_port_resume+0x5a8/0x780
[16446.978393]  usb_generic_driver_resume+0x28/0x60
[16446.978413]  usb_resume_both+0x16c/0x474
[16446.978436]  usb_dev_resume+0x2c/0x84
[16446.978446]  dpm_run_callback+0x50/0x250
[16446.978459]  device_resume+0x250/0x2f8
[16446.978473]  async_resume+0x28/0x12c
[16446.978492]  async_run_entry_fn+0x6c/0x3dc
[16446.978516]  process_one_work+0x24c/0x5bc
[16446.978530]  worker_thread+0x3e8/0xa50
[16446.978547]  kthread+0x150/0x1b4
[16446.978563]  ret_from_fork+0x10/0x30

> Alan Stern

Thanks,
Ray

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ