Date:	Sat, 10 Nov 2012 17:51:19 +0100
From:	Martin Steigerwald <Martin@...htvoll.de>
To:	linux-kernel@...r.kernel.org
Cc:	Chuansheng Liu <chuansheng.liu@...el.com>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>
Subject: Re: [REGRESSION] 3.7-rc3+git hard lockup on CPU after inserting/removing USB stick

On Saturday, 10 November 2012, Martin Steigerwald wrote:
> CC'd Chuansheng, the author of the bad patch, as well as Ingo and
> Thomas, as the issue seems to be related to threaded IRQs.
> 
> On Wednesday, 7 November 2012, Martin Steigerwald wrote:
> > On Wednesday, 7 November 2012, Greg Kroah-Hartman wrote:
> > > On Wed, Nov 07, 2012 at 03:01:38PM +0100, Martin Steigerwald wrote:
> > > > Hi!
> > > > 
> > > > I had this with something in between 3.7-rc3 and 3.7-rc4 after
> > > > inserting and removing a USB stick. This example is with a
> > > > kernel + f2fs patches v3, but I had this with 3.7-rc3 as well.
> > > 
> > > Ok, so it's not a new thing introduced in 3.7-rc4 (which is good,
> > > as there weren't any USB patches added between -rc3 and -rc4.)
> > > 
> > > Does it also happen on -rc2?  Anything older?  Can you run 'git
> > > bisect' to try to track it down?
> > 
> > It appears to be worse with 3.7-rc1. The machine basically locked up
> > a few moments after inserting the stick.
> > 
> > The first time I was on some tty and I saw lots of backtraces
> > flowing by, in the course of which the BTRFS on /, which resides on
> > an unrelated internal Intel SSD 320, was switched to read-only.
> > There were pauses between the backtraces. The second time I was in
> > a KDE session, which basically locked up soon as well. No mouse
> > pointer movements were possible, no switching to tty1.
> > 
> > I only have the last part of the backtrace of the first occurrence
> > as a photo.
> > 
> > Nothing was saved on SSD.
> > 
> > I do not want to go to an earlier 3.7 version than rc1 on this
> > production machine.
> 
> I bisected this after having made a backup:

[… bisect log and some explanations …]

> The first bad commit is:
> 
> commit 73d4066055e0e2830533041f4b91df8e6e5976ff
> Author: Chuansheng Liu <chuansheng.liu@...el.com>
> Date:   Tue Sep 11 16:00:30 2012 +0800
> 
>     USB/host: Cleanup unneccessary irq disable code
> 
>     Because the IRQF_DISABLED as the flag is now a NOOP and has been
>     deprecated and in hardirq context the interrupt is disabled.
> 
>     so in usb/host code:
>     Removing the usage of flag IRQF_DISABLED;
>     Removing the calling local_irq save/restore actions in irq
>     handler usb_hcd_irq();
> 
>     Signed-off-by: liu chuansheng <chuansheng.liu@...el.com>
>     Acked-by: Alan Stern <stern@...land.harvard.edu>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> 
> 
> But:
> 
> This only happens with the threadirqs option!

Just another note:

irq/16-ehci_hcd was taking >99% CPU. It had PR -51 in top and I think this 
was the task that made the CPU core stuck.
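
A note on reading that top output: for real-time tasks, top's PR column
shows -1 minus the real-time priority, so PR -51 corresponds to rtprio 50.
As far as I know, that is the default SCHED_FIFO priority the kernel gives
to threaded IRQ handlers, and a thread at that priority that never sleeps
would keep ordinary tasks off its core, which would fit the lockup. A
minimal sketch of the conversion (the function name is made up for
illustration, it is not part of any tool):

```python
def rt_priority_from_top_pr(pr: int) -> int:
    """Convert top's PR value of a real-time task to its rt priority.

    Assumption: PR is the kernel-reported priority, which for real-time
    tasks is -1 - rt_priority, i.e. -2 .. -100 for rt priorities 1 .. 99.
    (top itself prints "rt" instead of a number for the highest priority.)
    """
    if not -100 <= pr <= -2:
        raise ValueError(f"PR {pr} does not belong to a real-time task")
    return -1 - pr


if __name__ == "__main__":
    # PR value observed for irq/16-ehci_hcd in this report.
    print(rt_priority_from_top_pr(-51))
```

Values of PR of 0 and above belong to normal (SCHED_OTHER) tasks, which
is why the sketch rejects them instead of converting them.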

I have a short dmesg piece from the bad commit kernel. Attached.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

Download attachment "dmesg-on-bad-commit-kernel.txt.xz" of type "application/x-xz" (15620 bytes)
