lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <6f5be7a5-bf82-e857-5c81-322f2886099a@maciej.szmigiero.name>
Date:   Sat, 27 Mar 2021 16:55:20 +0100
From:   "Maciej S. Szmigiero" <mail@...iej.szmigiero.name>
To:     linux-usb@...r.kernel.org
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: >20 KB URBs + EHCI = bad performance due to stalls

Hi,

Is there any specific reason that URBs without URB_SHORT_NOT_OK flag that
span multiple EHCI qTDs have Alternate Next qTD pointer set to the dummy
qTD in their every qTD besides the last one (instead of to the first qTD
of the next URB to that endpoint)?

This causes that endpoint queue to stall in case of a short read that
does not reach the last qTD (I guess this condition persists until an
URB is (re)submitted to that endpoint, but I am not sure here).

One of affected drivers here is drivers/net/usb/r8152.c.

If I simply reduce its per-URB transfer buffer to 20 KB (the maximum
that fits in a well-aligned qTD) the RX rate increases from around
100 Mbps to 200+ Mbps (on an ICH8M controller):
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -6554,6 +6556,9 @@
                 break;
         }
  
+       if (tp->udev->speed == USB_SPEED_HIGH)
+               tp->rx_buf_sz = min(tp->rx_buf_sz, (u32)20 * 1024);
+
         return ret;
  }

The driver default is to use 32 KB buffers (which span two qTDs),
but the device rarely fully fills the first qTD resulting in
repetitive stalls and more than halving the performance.

As far as I can see, the relevant code in
drivers/usb/host/ehci-q.c::qh_urb_transaction() predates the git era.
The comment in that function before setting the Alternate Next qTD
pointer:
> /*
>  * short reads advance to a "magic" dummy instead of the next
>  * qtd ... that forces the queue to stop, for manual cleanup.
>  * (this will usually be overridden later.)
>  */

...suggests the idea was to override that pointer when
URB_SHORT_NOT_OK is not set, but this is actually done only for
the last qTD from the URB (also, that's the only one that ends
with interrupt flag set).

Looking at OHCI and UHCI host controller drivers the equivalent
limits seem to be different there (8 KB and 2 KB), while I don't
see any specific limit in the XHCI case.

Because of that variance in the URB buffer limit it seems strange
to me that this should be managed by a particular USB device driver
rather than by the host controller driver, because this would mean
every such driver would need to either use the lowest common
denominator for the URB buffer size (which is very small) or
hardcode the limit for every host controller that the device can
be connected to, which seems a bit inefficient.

Thanks,
Maciej

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ