lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 19 May 2009 14:00:10 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Johannes Berg <johannes@...solutions.net>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Zdenek Kabelac <zdenek.kabelac@...il.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: INFO: possible circular locking dependency at
	cleanup_workqueue_thread

On 05/19, Johannes Berg wrote:
>
> On Mon, 2009-05-18 at 21:47 +0200, Oleg Nesterov wrote:
>
> > > Maybe it shouldn't do that from the CPU_POST_DEAD
> > > notifier?
> >
> > Well, in any case we should understand why we have the problem, before
> > changing the code. And CPU_POST_DEAD is not special, why should we treat
> > it specially and skip lock_map_acquire(wq->lockdep_map) ?
>
> I'm not familiar enough with the code -- but what are we really trying
> to do in CPU_POST_DEAD? It seems to me that at that time things must
> already be off the CPU, so ...?

Yes, this cpu is dead, we should do cleanup_workqueue_thread() to kill
cwq->thread.

> On the other hand that calls
> flush_cpu_workqueue() so it seems it would actually wait for the work to
> be executed on some other CPU, within the CPU_POST_DEAD notification?

Yes. Because we can't just kill cwq->thread, we can have the pending
work_structs so we have to flush.

Why can't we move these works to another CPU? We can, but this doesn't
really help. Because in any case we should at least wait for
cwq->current_work to complete.

Why do we use CPU_POST_DEAD, and not (say) CPU_DEAD to flush/kill ?
Because work->func() can sleep in get_online_cpus(), we can't flush
until we drop cpu_hotplug.lock.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ