Message-ID: <20090501091501.GA29466@ime.usp.br>
Date:	Fri, 1 May 2009 06:15:01 -0300
From:	Rogério Brito <rbrito@...rs.sf.net>
To:	Alan Stern <stern@...land.harvard.edu>
Cc:	Robert Hancock <hancockrwd@...il.com>,
	linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org
Subject: Re: [2.6.30-rc2] usb reset during big file transfer and ext3 error

Hi, Alan.

Sorry for the late reply, but one of my HDs was giving me trouble. :-(
Of course, I have backups. :-)

On Apr 22 2009, Alan Stern wrote:
> On Wed, 22 Apr 2009, Rogério Brito wrote:
> > Is there any way of controlling the number of retries in the host
> > controller? Or, perhaps, of controlling the time between retries so
> > that the device can shape it up again?
> 
> It's not all that simple.  The host controller allows the OS to set the
> number of hardware retries to 1, 2, 3, or unlimited.  Linux uses 3;  
> those XactErr debugging messages in your log show that the driver was
> extending the number of retries in software.

Right. I didn't know that. Obviously, setting it to unlimited could
lead to undefined behavior on the computer.

> It's not possible to change the time interval between retries done by
> the hardware.  While it is possible in theory to change the interval
> between retries done by the driver, it would be rather difficult and
> so ehci-hcd doesn't attempt it.

Oh, what a pity. It seems that the device at hand more or less recovers
after some time: I have an automounter here, and the device nodes
appear again under /dev and the device gets auto-mounted at the
appropriate mount point. Weird.

> The software retries were introduced to solve one particular problem:
> Many EHCI controllers will generate a transaction error if a data
> transfer is occurring on one port at the same time as a device is
> being unplugged on another port.

Right. I just got myself an unpowered USB hub and I noticed one thing
(unrelated to this problem): if I plug a USB disk into this hub and then
plug in a printer, very weird things happen, like the disk being
unmounted.

I can give you precise details of what happens here, if you're
interested.

OTOH, I think that I may be seeing some other problems with a pen drive
connected to a port of the machine I'm typing this message on. I will
try to compile a newer kernel, now that -rc4 is released, and I would
appreciate it if you could help me with the issues that I'm seeing.

> This is clearly a hardware bug, and the software retries were intended
> to work around it.  In practice only a couple of software retries are
> needed; if the transfer hasn't succeeded by that point then it's never
> going to succeed.  I set the upper limit to 32 retries just to be
> conservative.

OK. Thanks for the nice and clear explanation of the problem. I only
wonder why I am not seeing these errors on other machines while I *do*
see them on this one (an Intel ICH5).
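
A minimal sketch of the two-level retry scheme Alan describes in the
quoted text above: the controller retries in hardware (Linux sets that
count to 3), and the driver resubmits in software up to a limit of 32.
This is only an illustrative model, not code from ehci-hcd. The names
run_transfer and transaction_ok are hypothetical, and the exact way the
tries are counted here is an assumption.

```python
# Illustrative model only; NOT the ehci-hcd implementation.
HW_RETRIES = 3        # hardware retry setting used by Linux, per the thread
SW_RETRY_LIMIT = 32   # upper limit on software retries, per the thread


def run_transfer(transaction_ok, hw_retries=HW_RETRIES,
                 sw_retry_limit=SW_RETRY_LIMIT):
    """Model a transfer that is resubmitted after transaction errors.

    transaction_ok is a hypothetical callable that takes the 0-based
    index of the bus-level try and returns True if that try succeeds.
    Returns a tuple (succeeded, software_retries_used).
    """
    tries = 0
    # Submission 0 is the original one; 1..sw_retry_limit are software retries.
    for sw_retries in range(sw_retry_limit + 1):
        # Within one submission, the controller tries up to hw_retries times
        # (assumption: a setting of 3 means three tries before giving up).
        for _ in range(hw_retries):
            ok = transaction_ok(tries)
            tries += 1
            if ok:
                return True, sw_retries
        # The hardware gave up (a transaction error); loop to resubmit.
    return False, sw_retry_limit
```

For example, run_transfer(lambda i: i >= 4) models a device that fails
its first four tries: the first submission uses up its three hardware
tries, and the transfer completes during the first software retry. A
device that never answers makes the model give up after 33 submissions,
which matches the remark that a transfer which has not succeeded after a
couple of software retries is not going to succeed at all.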

> If transaction errors aren't caused by noise in the cable then they
> are almost always caused by bugs or failures in the device.

I will try again with a shorter and newer cable. Let's see how that
works. BTW, is there any way to check the quality of a cable? I have a
multimeter here and I would be willing to do some extensive tests.
Testing the USB enclosure is also pretty feasible.

> Once a device's firmware has crashed, it doesn't magically fix itself.

Oh, what a pity that it doesn't recover by itself with a watchdog-like
mechanism.


Thanks for all your help, Rogério.

-- 
Rogério Brito : rbrito@...ckenzie,ime.usp}.br : GPG key 1024D/7C2CAEB8
http://www.ime.usp.br/~rbrito : http://meusite.mackenzie.com.br/rbrito
Projects: algorithms.berlios.de : lame.sf.net : vrms.alioth.debian.org
