Message-ID: <20090205180015.GA28738@redhat.com>
Date:	Thu, 5 Feb 2009 19:00:15 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Eric Dumazet <dada1@...mosbay.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/3] workqueue: not allow recursion run_workqueue

On 02/05, Frederic Weisbecker wrote:
>
> On Thu, Feb 05, 2009 at 06:01:56PM +0100, Oleg Nesterov wrote:
> > On 02/05, Lai Jiangshan wrote:
> > >
> > > > DEADLOCK EXAMPLE to explain my opinion above:
> > > >
> > > > (work_func0() and work_func1() are work callbacks, and they
> > > > call flush_workqueue())
> > >
> > > > CPU#0                                   CPU#1
> > > run_workqueue()                         run_workqueue()
> > >   work_func0()                            work_func1()
> > >     flush_workqueue()                       flush_workqueue()
> > >       flush_cpu_workqueue(0)                  .
> > >       flush_cpu_workqueue(cpu#1)              flush_cpu_workqueue(cpu#0)
> > >         waiting work_func1() in cpu#1           waiting work_func0 in cpu#0
> > >
> > > DEADLOCK!
> > 
> > I am not sure. Note that when work_func0() calls run_workqueue(),
> > it will clear cwq->current_work, so another flush_ on CPU#1 will
> > not wait for work_func0, no?
>
> No, but CPU#1 can wait for a completion that will never be done, because
> CWQ#0 is waiting for CWQ#1.

I still can't understand. When work_func0()->run_workqueue() returns,
we should have no works in ->worklist and ->current_work must be NULL.
If a barrier was inserted before that, it should have been flushed.


But yes, a deadlock is possible if other works are queued after
run_workqueue() returns and before work_func1() starts the flush. The
description is just not exactly accurate, imho.
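
To make the window described above concrete, here is a rough user-space
sketch in C. It is not kernel code: struct cwq_model and the *_model()
helpers are hypothetical names invented for this illustration, and the
sketch assumes that flushing another CPU's queue only waits when that
queue has pending works or a non-NULL ->current_work, and that a
recursive run_workqueue() leaves the local queue empty with
->current_work cleared.

```c
#include <stdbool.h>

/* Hypothetical user-space model of one per-CPU workqueue; not kernel code. */
struct cwq_model {
    int  pending;       /* works sitting in ->worklist */
    bool current_work;  /* true while ->current_work would be non-NULL */
    int  blocked_on;    /* CPU whose queue this worker waits on, or -1 */
};

/* Recursive run_workqueue() from inside a work: drains the local list. */
static void run_workqueue_model(struct cwq_model *self)
{
    self->pending = 0;
    self->current_work = false;
}

/* Flushing another CPU's queue only waits if that queue looks active. */
static bool flush_other_model(struct cwq_model *cwq, int self, int target)
{
    bool active = cwq[target].pending > 0 || cwq[target].current_work;

    if (active)
        cwq[self].blocked_on = target;
    return active;
}

/* Follow the waits-for chain; returning to `start` means no one can progress. */
static bool is_deadlocked(const struct cwq_model *cwq, int ncpus, int start)
{
    int cur = start;

    for (int hops = 0; hops < ncpus; hops++) {
        cur = cwq[cur].blocked_on;
        if (cur < 0)
            return false;
        if (cur == start)
            return true;
    }
    return false;
}
```

In this model, if CPU#0 drains its queue and nothing new is queued, the
flush from CPU#1 finds CPU#0 idle and does not wait, so there is no
cycle. If a work is queued on CPU#0 after its run_workqueue_model() call
and before CPU#1 flushes, both workers end up blocked on each other and
is_deadlocked() reports a cycle.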

And we have other problems. For instance, nothing can guarantee that
run_workqueue() will ever return. It is legitimate for a work_struct to
always re-queue itself, as long as it is cancelled before destroy_workqueue().
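
The non-termination concern can be sketched the same way. This is a
hypothetical user-space model, not the kernel's run_workqueue(): the
function name, its parameters, and the step bound are assumptions made
only so the example can stop and report a result.

```c
#include <stdbool.h>

/*
 * Hypothetical model of draining a worklist: returns true if the list
 * becomes empty within max_steps. A self-requeueing work puts itself
 * back on the list every time it runs, so the list never empties.
 */
static bool drain_terminates(int plain_works, bool has_self_requeueing_work,
                             int max_steps)
{
    int pending = plain_works + (has_self_requeueing_work ? 1 : 0);

    for (int step = 0; step < max_steps; step++) {
        if (pending == 0)
            return true;
        if (plain_works > 0) {
            /* An ordinary work runs once and leaves the list. */
            plain_works--;
            pending--;
        }
        /* Otherwise only the self-requeueing work is left: pending stays. */
    }
    return pending == 0;
}
```

With only ordinary works the drain finishes; with one self-requeueing
work present it never does, regardless of how large the step bound is.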

Oleg.
