Message-ID: <alpine.LNX.2.00.1302141804400.3330@eggly.anvils>
Date:	Thu, 14 Feb 2013 18:09:57 -0800 (PST)
From:	Hugh Dickins <hughd@...gle.com>
To:	Dave Jones <davej@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	paul.mckenney@...aro.org
Subject: Re: Debugging Thinkpad T430s occasional suspend failure.

On Thu, 14 Feb 2013, Dave Jones wrote:
> On Wed, Feb 13, 2013 at 11:56:25AM -0800, Linus Torvalds wrote:
>  > On Wed, Feb 13, 2013 at 11:34 AM, Dave Jones <davej@...hat.com> wrote:
>  > >
>  > > My test was a loop of 100 suspend/resume cycles before calling something
>  > > 'good'. The 'bad' cases all failed within 10 cycles (usually 2-3).
>  > 
>  > Considering that you apparently already found one case where the BIOS
>  > crapped out due to effectively unrelated timing details (ie timing
>  > triggered a temperature issue that then triggered behavioral changes),
>  > I wonder if your more occasional problem might not be a sign of
>  > something similar.
>  > 
>  > But since you seem to be able to automate it well, maybe one thing to
>  > try is to change the timing a bit while testing. Maybe some failures
>  > were hidden by the timing just happening to work out.
> 
> Given I never saw this on a Fedora kernel, just my self-built ones, I eventually
> gave up on bisecting code, and switched to bisecting config options.
> I should have started this way, as I figured it out within an hour.
> 
> 3.7 merge window is when I started seeing this, and here's what got introduced
> during that time..
> 
> commit e3ebfb96f396731ca2d0b108785d5da31b53ab00
> Author: Paul E. McKenney <paul.mckenney@...aro.org>
> Date:   Mon Jul 2 14:42:01 2012 -0700
> 
>     rcu: Add PROVE_RCU_DELAY to provoke difficult races
> 
> 'difficult' is an understatement.  This explains why some of those 'good'
> bisects survived 100 suspends on one day, and failed the next.
> 
> Unfortunately, I don't think there's any sane way to retrieve whatever debug
> info might be getting spewed.  Perhaps when I reinstall, and switch to booting EFI
> I'll be able to use pstore, but on a bios-based boot, all hope seems lost.
> No netconsole, no usb-serial, even crippling i915's suspend routine doesn't help.
> 
> I'll just disable this option for now.

Which won't affect my case since I never enabled it.

Hugh
