Date:	Thu, 28 Apr 2011 12:30:36 +0200
From:	Tejun Heo <tj@...nel.org>
To:	Thilo-Alexander Ginkel <thilo@...kel.com>
Cc:	Arnd Bergmann <arnd@...db.de>, "Rafael J. Wysocki" <rjw@...k.pl>,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com
Subject: Re: Soft lockup during suspend since ~2.6.36 [bisected]

Hello,

On Thu, Apr 28, 2011 at 01:51:34AM +0200, Thilo-Alexander Ginkel wrote:
> Well, I get your point. ;-) Maybe this fact can help as a motivator: I
> ran some further tests and while -rc3 seems to be ok (and survived 100
> suspend/resume cycles), the issue strangely seems to be back with -rc4
> (the softlockup call stack that I can see is identical to the photos
> below; the lockup happened after only two cycles).
> 
> > Before I go ahead and try that, do you happen to have a softlockup dump?
> > ie. stack traces of the stuck tasks?  I can't find the original
> > posting.
> 
> Sure:
> 
> From <BANLkTi=n4jLsjOYCd0L3hYb30sgPmdv_WA@...l.gmail.com>:
> > Unfortunately, the output via a serial console becomes garbled after
> > "Entering mem sleep", so I went for patching dumpstack_64.c and a
> > couple of other source files to reduce the verbosity. I hope not to
> > have stripped any essential information. The result is available in
> > these pictures:
> >   https://secure.tgbyte.de/dropbox/IeZalo4t-1.jpg
> >   https://secure.tgbyte.de/dropbox/IeZalo4t-2.jpg
> >
> > For both traces, the printed error message reads: "BUG: soft lockup -
> > CPU#3 stuck for 67s! [kblockd:28]"

Does your kernel have preemption enabled?  If not, does the following
patch fix the problem?

Thanks.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 04ef830..08c7334 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1293,6 +1293,7 @@ __acquires(&gcwq->lock)
 
 		/* CPU has come up inbetween, retry migration */
 		cpu_relax();
+		cond_resched();
 	}
 }
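
For anyone following the thread, here is a rough user-space analogy of
why the extra call might matter. It is only a sketch, not kernel code:
wait_until_ready(), its max_retries bound and the use of sched_yield()
are all hypothetical stand-ins chosen for illustration. The idea is that
a retry loop which only relaxes the CPU never gives the task it is
waiting for a chance to run unless something preempts it, whereas an
explicit yield on each retry does.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * User-space analogy only (hypothetical helper, not kernel code):
 * spin until *ready becomes true, yielding the CPU on every retry so
 * that whoever is supposed to set the flag gets a chance to run.
 * Returns the number of retries, or -1 once max_retries is exhausted.
 */
static long wait_until_ready(atomic_bool *ready, long max_retries)
{
	long retries = 0;

	while (!atomic_load(ready)) {
		if (retries >= max_retries)
			return -1;
		retries++;
		/* Stands in for cond_resched() in the patch above. */
		sched_yield();
	}
	return retries;
}
```

The retry bound exists only so the sketch cannot spin forever; the
kernel loop touched by the patch has no such bound.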
