Message-ID: <485FC6C7.5030001@nokia.com>
Date:	Mon, 23 Jun 2008 18:52:39 +0300
From:	Stefan Becker <Stefan.Becker@...ia.com>
To:	ext Alan Stern <stern@...land.harvard.edu>,
	linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org
Subject: Re: [REGRESSION] 2.6.24/25: random lockups when accessing external
 USB harddrive

Hi,

[I'm not subscribed to this list, so please CC: me when you answer]

ext Alan Stern wrote:
> On Sun, 22 Jun 2008, Rene Herman wrote:
> 
>> On 22-06-08 18:55, Stefan Becker wrote:
>>
>>> I get random machine lockups when accessing my USB harddrive with 
>>> kernels 2.6.24/25. They don't occur with kernel 2.6.23. During testing I 
>>> figured out that it has something to do with the USB Bluetooth adaptor. 
>>> If I remove it before the testing I don't get any lockups.
> 
> Does the same problem still occur in 2.6.26-rc7?

Yes.


>  Does it occur if you rmmod ehci-hcd?

Yes, i.e. it also happens when the external hard drive runs as a USB 1.1 
device at 12 Mbps.


> Machine lockups are awfully hard to debug.  Can you get any information
> at all (like Alt-SysRq-T) when this happens?

SysRq does not work when the machine locks up. I forgot to mention that 
the test machine has a single CPU and that the CPU fan starts to run at 
full speed when the lockup occurs.

Judging from the commit returned by git bisect, I suspect a locking 
error, i.e. the CPU runs into a spinlock that is already held and 
therefore busy-loops.
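
To make the suspected failure mode concrete, here is a minimal userspace 
sketch in C11. It is not kernel code and is not taken from the bisected 
commit; toy_spinlock, toy_trylock, toy_lock_bounded and 
nested_acquire_would_hang are hypothetical names invented for this 
illustration. It models a non-recursive spinlock and shows that a second 
acquire from the same execution context can never succeed, which on a 
single CPU with an unbounded spin would look like a hard lockup with the 
CPU running at 100%.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a non-recursive spinlock (not kernel code). */
struct toy_spinlock {
    atomic_flag locked;
};

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(struct toy_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

static void toy_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * Bounded spin: give up after max_spins attempts.
 * An unbounded spin would simply loop here forever.
 */
static bool toy_lock_bounded(struct toy_spinlock *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (toy_trylock(l))
            return true;
    }
    return false;
}

/*
 * Models one execution context taking the lock and then, while still
 * holding it, trying to take it again (e.g. via a nested call path).
 * Returns true if the nested acquire never succeeds.
 */
static bool nested_acquire_would_hang(void)
{
    struct toy_spinlock lock = TOY_SPINLOCK_INIT;

    bool outer = toy_lock_bounded(&lock, 1);        /* fresh lock: succeeds */
    assert(outer);

    bool inner = toy_lock_bounded(&lock, 1000000);  /* nobody can release it */

    toy_unlock(&lock);
    return !inner;
}
```

Calling nested_acquire_would_hang() returns true here: with nobody else 
running to release the lock, no number of spins helps. The bounded retry 
count only exists so that the sketch terminates instead of hanging.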


> Can you add debugging
> printk statements to the USB bluetooth driver to try and localize where
> the hang occurs?

Any suggestions where to start?


>>> git bisect resulted in the following bad commit:
>>>
>>> e9df41c5c5899259541dc928872cad4d07b82076 is first bad commit
>>> commit e9df41c5c5899259541dc928872cad4d07b82076
>>> Author: Alan Stern <stern@...land.harvard.edu>
>>> Date:   Wed Aug 8 11:48:02 2007 -0400
>>>
>>>     USB: make HCDs responsible for managing endpoint queues
> 
> Knowing this doesn't help much without more information.

Too bad. Each bisect cycle took 2-3 hours and the whole process took me 
3 days :-( :-(

That commit has spinlock changes, so I hoped that it would be a good 
starting point. Is there a way to track the locks?


> Do you have any idea why nobody else has reported this sort of problem?  
> Is it reproducible on other machines?

I attached both USB devices to another, newer dual core laptop. I 
couldn't reproduce the problem there, even when I simulated a single CPU 
machine with maxcpus=1.

Regards,

	Stefan

---
Stefan Becker
E-Mail: Stefan.Becker@...ia.com