Date:	Mon, 9 Nov 2009 21:00:58 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Mike Galbraith <efault@....de>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	pm list <linux-pm@...ts.linux-foundation.org>,
	Greg KH <gregkh@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jesse Barnes <jbarnes@...tuousgeek.org>
Subject: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd

On Monday 09 November 2009, Thomas Gleixner wrote:
> On Mon, 9 Nov 2009, Rafael J. Wysocki wrote:
> 
> > On Monday 09 November 2009, Mike Galbraith wrote:
> > > On Mon, 2009-11-09 at 16:47 +0100, Rafael J. Wysocki wrote:
> > > > On Monday 09 November 2009, Mike Galbraith wrote:
> > > 
> > > > > > Very likely.  What did you do to fix it?
> > > > > 
> > > > > You don't really wanna know.  In 31 with newidle enabled, the below
> > > > > fixed it.  It won't fix 32, though it might cure the resume problem.
> > > > 
> > > > OK, I'll give it a try.
> > 
> > It doesn't help.
> > 
> > Also, I can reproduce the issue with current -git and kernel preemption
> > disabled.
> > 
> > > I just tried to trigger badness via high speed online/offline combined
> > > with taskset with CONFIG_PREEMPT enabled, and couldn't make it explode.
> > 
> > I'm not able to do it this way either, so resume seems to be necessary to
> > trigger it.  I'm going to try with the suspend debug in the "core" mode.
> > 
> > > (damn, wish i could s2ram this box)
> > 
> > That need not suffice.  I have two other boxes that suspend and resume
> > correctly with 2.6.32-rc, AFAICS.
> > 
> > However, there seems to be a systematic error somewhere, since the failure
> > always happens at the same place, i.e. list_del_init(cwq->worklist.next); in
> > run_workqueue(), in preemptible as well as in non-preemptible kernels.
> > 
> > Which is kind of strange, given the !list_empty(&cwq->worklist) test right
> > before it.
> 
> Does that happen before or after the secondary CPU has been brought up ?

Way after.  It seems to happen more-or-less during or right after the thawing
of tasks.

Moreover, the call trace I get is (manual transcription):

? autoremove_wake_function+0x0
? worker_thread+0x0
kthread+0x69
child_rip+0xa

where kthread+0x69 is the do_exit(ret); in kthread().  Afterwards it says
that "events/0" exited with preempt_count = 1 (it sometimes is "events/1"
IIRC).

Still, RIP always points to list_del_init(cwq->worklist.next); in
run_workqueue().
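
For illustration only, here is a minimal userspace sketch of why a
!list_empty() test on the head does not by itself make list_del_init() on the
first entry safe.  It assumes the usual circular doubly-linked list semantics
and re-implements simplified versions of the helpers; it is not the kernel
code.  The corruption step (an entry whose pointers are clobbered while it is
still queued) and the names entry_is_consistent(),
stale_entry_passes_list_empty() and healthy_entry_unlinks_cleanly() are
hypothetical, chosen to show the mechanism, not a diagnosis of this bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a circular doubly-linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Only inspects the head: true when head->next points back at the head. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	struct list_head *prev = head->prev;

	item->next = head;
	item->prev = prev;
	prev->next = item;
	head->prev = item;
}

/* Dereferences the entry's own next/prev, which list_empty() never checked. */
static void list_del_init(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
	INIT_LIST_HEAD(e);
}

/* Hypothetical helper: the invariants list_del_init() silently relies on. */
static bool entry_is_consistent(const struct list_head *e)
{
	return e->next != NULL && e->prev != NULL &&
	       e->next->prev == e && e->prev->next == e;
}

/* Normal case: the queued entry is intact and unlinks cleanly. */
bool healthy_entry_unlinks_cleanly(void)
{
	struct list_head head, work;

	INIT_LIST_HEAD(&head);
	INIT_LIST_HEAD(&work);
	list_add_tail(&work, &head);
	assert(!list_empty(&head));
	assert(entry_is_consistent(head.next));
	list_del_init(head.next);
	return list_empty(&head) && list_empty(&work);
}

/*
 * Assumed failure mode: the entry's memory is overwritten while it is still
 * linked.  The head still points at it, so list_empty() reports "not empty",
 * yet unlinking it would dereference invalid pointers.
 */
bool stale_entry_passes_list_empty(void)
{
	struct list_head head, work;

	INIT_LIST_HEAD(&head);
	INIT_LIST_HEAD(&work);
	list_add_tail(&work, &head);
	work.next = NULL;	/* simulated corruption, never dereferenced here */
	work.prev = NULL;
	return !list_empty(&head) && !entry_is_consistent(head.next);
}
```

Both functions return true.  In the second case the sketch only checks the
invariants instead of calling list_del_init(), because the unlink would
dereference a NULL pointer.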

Thanks,
Rafael