lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 14 Apr 2011 03:39:55 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Chris Mason <chris.mason@...cle.com>,
	Frank Rowand <frank.rowand@...sony.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Mike Galbraith <efault@....de>,
	Oleg Nesterov <oleg@...hat.com>, Paul Turner <pjt@...gle.com>,
	Jens Axboe <axboe@...nel.dk>,
	Yong Zhang <yong.zhang0@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 04/21] sched: Change the ttwu success details

Hello, Peter.

On Wed, Apr 13, 2011 at 01:06:39PM +0200, Peter Zijlstra wrote:
> On Wed, 2011-04-13 at 12:48 +0200, Peter Zijlstra wrote:
> > Appears to be sufficient to cause the lockup, so somehow the whole
> > workqueue stuff relies on the fact that waking a TASK_(UN)INTERRUPTIBLE
> > task that hasn't been dequeued yet isn't a wakeup.
> > 
> > Tejun any quick clues as to why and how to cure this?
> > 
> > /me goes read that stuff
> 
> OK, so wq_worker_waking_up() does an atomic_inc() that wants to be
> balanced against the atomic_dec() in wq_worker_sleeping(), which is only
> called when we dequeue things.

Yeap, the root cause of the problem is that the change makes
wq_worker_sleeping() and wq_worker_waking_up() asymmetric and thus
puts the nr_running counter goes out of sync which hides active worker
depletion from the workqueue code leading to stall.

One way to deal with it would be adding an extra worker flag to track
sleep state from workqueue side so that it can filter out spurious
wakeups; however, I think it would be far better to resolve this from
scheduler side.  If the callback name is misleading, rename it to
wq_worker_sched_activated() or something and call it only when the
task gets activated.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ