Date:	Wed, 22 Apr 2009 22:53:52 -0400 (EDT)
From:	Alan Stern <stern@...land.harvard.edu>
To:	Rogério Brito <rbrito@....usp.br>
cc:	Robert Hancock <hancockrwd@...il.com>,
	<linux-kernel@...r.kernel.org>, <linux-usb@...r.kernel.org>
Subject: Re: [2.6.30-rc2] usb reset during big file transfer and ext3 error

On Wed, 22 Apr 2009, Rogério Brito wrote:

> > According to the EHCI spec, XactErr is "Set to a one by the Host  
> > Controller during status update in the case where the host did not  
> > receive a valid response from the device (Timeout, CRC, Bad PID,
> > etc.)"
> 
> Is there any way of controlling the number of retries in the host
> controller? Or, perhaps, of controlling the time between retries so that
> the device can shape it up again?

It's not all that simple.  The host controller allows the OS to set the
number of hardware retries to 1, 2, 3, or unlimited.  Linux uses 3;  
those XactErr debugging messages in your log show that the driver was
extending the number of retries in software.
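
As a rough illustration of the hardware side, the retry budget can be pictured as a two-bit error counter in the transfer descriptor's token word, with XactErr as one of the status bits. The bit positions below (error counter in bits 11:10, XactErr in bit 3, a counter value of 0 meaning no limit) are taken from the EHCI qTD token layout as best understood here; the macro and function names are invented for this sketch and are not the identifiers ehci-hcd uses.

```c
#include <stdint.h>

/* Hypothetical names; bit positions assumed from the EHCI qTD token layout. */
#define TOKEN_XACTERR     (1u << 3)                  /* status bit: transaction error */
#define TOKEN_CERR_SHIFT  10
#define TOKEN_CERR_MASK   (0x3u << TOKEN_CERR_SHIFT) /* 2-bit error counter */

/* Read the hardware error counter: 1, 2, or 3, or 0 for "no limit". */
static unsigned token_cerr(uint32_t token)
{
    return (token & TOKEN_CERR_MASK) >> TOKEN_CERR_SHIFT;
}

/* Return a copy of the token with the error counter set to cerr (0..3). */
static uint32_t token_set_cerr(uint32_t token, unsigned cerr)
{
    return (token & ~TOKEN_CERR_MASK) |
           ((uint32_t)(cerr & 0x3u) << TOKEN_CERR_SHIFT);
}

/* Nonzero if the controller flagged a transaction error in the status bits. */
static int token_has_xacterr(uint32_t token)
{
    return (token & TOKEN_XACTERR) != 0;
}
```

Because the field is only two bits wide, there is no room for a larger hardware count; anything beyond 3 has to be done by the driver.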

It's not possible to change the time interval between retries done by
the hardware.  While it is possible in theory to change the interval
between retries done by the driver, it would be rather difficult and
so ehci-hcd doesn't attempt it.

The software retries were introduced to solve one particular problem:  
Many EHCI controllers will generate a transaction error if a data
transfer is occurring on one port at the same time as a device is being
unplugged on another port.  This is clearly a hardware bug, and the
software retries were intended to work around it.  In practice only a
couple of software retries are needed; if the transfer hasn't succeeded
by that point then it's never going to succeed.  I set the upper limit
to 32 retries just to be conservative.
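
The layering of software retries on top of hardware retries can be sketched roughly as follows. This is a simplified user-space model, not the ehci-hcd code: the hardware round is modelled as up to 3 attempts, the driver re-queues the transfer up to 32 times, and every name here (HW_RETRIES, SW_RETRY_MAX, xact_fn, struct flaky, and the functions) is hypothetical. The exact boundary condition in the real driver may differ.

```c
#include <stdbool.h>

#define HW_RETRIES    3   /* hardware retry count Linux programs, per the text */
#define SW_RETRY_MAX  32  /* upper limit on software retries, per the text */

/* Hypothetical callback: returns true if one bus transaction succeeds. */
typedef bool (*xact_fn)(void *ctx);

/* One hardware round: the controller gives up after HW_RETRIES errors. */
static bool hw_round(xact_fn attempt, void *ctx)
{
    for (int i = 0; i < HW_RETRIES; i++)
        if (attempt(ctx))
            return true;
    return false;   /* corresponds to the controller reporting XactErr */
}

/*
 * Run the initial hardware round, then re-queue on each XactErr.
 * Returns the number of software retries used, or -1 once the
 * limit is exhausted and the transfer is reported as failed.
 */
static int transfer_with_sw_retries(xact_fn attempt, void *ctx)
{
    for (int sw = 0; sw <= SW_RETRY_MAX; sw++)
        if (hw_round(attempt, ctx))
            return sw;
    return -1;
}

/* Toy device model: fails a fixed number of attempts, then works. */
struct flaky {
    int failures_left;
};

static bool flaky_attempt(void *ctx)
{
    struct flaky *f = ctx;

    if (f->failures_left > 0) {
        f->failures_left--;
        return false;
    }
    return true;
}
```

In this model a brief glitch (say, four failed attempts) is absorbed by a single software retry, while a device that never answers runs through all the retries and then fails, which matches the observation that a transfer either recovers within a couple of retries or not at all.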

Delaying longer in order to allow the device to shape itself up is
generally hopeless.  I haven't seen more than one or two cases where
that would work -- and it's quite possible that those cases would have
worked out okay if the software retry mechanism had existed back when
they occurred.  If transaction errors aren't caused by noise in the
cable then they are almost always caused by bugs or failures in the
device.  Once a device's firmware has crashed, it doesn't magically fix
itself.

Alan Stern

