lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <36de9669-aee1-bd3e-b36e-3ecb1a314efc@roeck-us.net>
Date:   Tue, 25 Jul 2017 19:33:39 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Alan Stern <stern@...land.harvard.edu>,
        Zdenek Kabelac <zkabelac@...hat.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, dianders@...omium.org,
        gregkh@...uxfoundation.org
Subject: Re: USB disk speed regression WD Elements - with bisect result
 22547c4cc4fe20698a6a85a55b8788859134b8e4

On 07/25/2017 12:50 PM, Alan Stern wrote:
> On Tue, 25 Jul 2017, Zdenek Kabelac wrote:
> 
>> Dne 25.7.2017 v 19:02 Alan Stern napsal(a):
>>> On Tue, 25 Jul 2017, Zdenek Kabelac wrote:
>>>
>>>> And in fact it's the very same commit - which adds this message
>>>> (just check current 4.13 with and without this commit reverted)
>>>>
>>>> So here goes usbmon trace (aka  'cat /sys/kernel/debug/usb/usbmon/0u')
>>>> no other usb device has been touch so should not hopefully interfere here.
>>>>
>>>> Trace is from 4.12 kernel - so it has reported  "not running at top speed"
>>>
>>> Can you collect an equivalent trace under 4.8?
>>>
>>> Alan Stern
>>>
>>
>> Hi
>>
>>
>> Sure - no pb.
>>
>> Just attached & detached  USB disk in this trace
> 
> This is really peculiar.  The only difference is that the 4.12 trace
> shows an extra 250 ms delay (including two extra Get-Port-Status

Most likely we are now catching the long reset timeout. HUB_LONG_RESET_TIME
is 200 ms. It looks like the code is catching exactly the condition
addressed in my patch, ie USB_PORT_STAT_RESET is cleared but
USB_PORT_STAT_CONNECTION is not yet set. hub_port_wait_reset() will
now wait for USB_PORT_STAT_CONNECTION, which it didn't do before.

> requests) compared to the 4.8 trace.  I honestly can't tell what could
> be causing the switch from high speed to full speed, or why it would
> happen in one case but not the other.
> 

We talked about this today. What is causing the switch from high speed to
full speed ? Is this triggered by the kernel, or by the USB controller ?

If it is triggered by the USB controller, maybe the changed timing around
repeated hub resets, either in combination with an additional hub reset
or in combination with fewer resets, results in the full speed connection.
That is just a wild guess, of course.

Guenter

> It's obvious that _something_ is confusing the USB hardware in the disk
> drive.  This shows up in both traces: After you unplug and replug the
> drive, it initially connects at high-speed and then disconnects itself
> before the computer can communicate with it.  Then a second connection
> occurs.  For example, in the 4.8 log:
> 
> 08:18 usb 2-2: USB disconnect, device number 2
> 08:25 usb 2-2: new high-speed USB device number 3 using ehci-pci
> 08:26 usb 2-2: new high-speed USB device number 4 using ehci-pci
> 
> If the drive were working entirely correctly, it wouldn't do that.
>  > We could continue futzing around with hardware and driver tests for a
> long time.  But there may be a shortcut: If you have a USB hub, you
> could try attaching the drive through it.  It's entirely possible that
> this will fix the problem.
> 
> If not, you'll have to start doing some very detailed tests.  As a
> start, you can enable debugging for the usbcore and ehci_hcd drivers
> immediately before the test:
> 
> echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
> echo 'module ehci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
> dmesg -C
> 
> Then after the test, see what shows up in the dmesg output.  And again,
> we'll want to do a comparison.  In fact, 4.12 with and without the
> commit you identified would make a better comparison than 4.12 vs. 4.8.
> 
> Alan Stern
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ