[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090126222405.GA15896@elte.hu>
Date: Mon, 26 Jan 2009 23:24:05 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, a.p.zijlstra@...llo.nl,
rusty@...tcorp.com.au, travis@....com, mingo@...hat.com,
davej@...hat.com, cpufreq@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.
* Oleg Nesterov <oleg@...hat.com> wrote:
> On 01/26, Andrew Morton wrote:
> >
> > On Mon, 26 Jan 2009 22:45:16 +0100
> > Ingo Molnar <mingo@...e.hu> wrote:
> >
> > > that would change the concept of execution but indeed it would be
> > > interesting to try. It's outside the scope of late -rcs i guess, but
> > > worthwile nevertheless.
> > >
> >
> > Well it turns out that I was having a less-than-usually-senile moment:
> >
> > : commit b89deed32ccc96098bd6bc953c64bba6b847774f
> > : Author: Oleg Nesterov <oleg@...sign.ru>
> > : AuthorDate: Wed May 9 02:33:52 2007 -0700
> > : Commit: Linus Torvalds <torvalds@...dy.linux-foundation.org>
> > : CommitDate: Wed May 9 12:30:50 2007 -0700
> > :
> > : implement flush_work()
> > :
> > : A basic problem with flush_scheduled_work() is that it blocks behind _all_
> > : presently-queued works, rather than just the work whcih the caller wants to
> > : flush. If the caller holds some lock, and if one of the queued work happens
> > : to want that lock as well then accidental deadlocks can occur.
> > :
> > : One example of this is the phy layer: it wants to flush work while holding
> > : rtnl_lock(). But if a linkwatch event happens to be queued, the phy code will
> > : deadlock because the linkwatch callback function takes rtnl_lock.
> > :
> > : So we implement a new function which will flush a *single* work - just the one
> > : which the caller wants to free up. Thus we avoid the accidental deadlocks
> > : which can arise from unrelated subsystems' callbacks taking shared locks.
> > :
> > : flush_work() non-blockingly dequeues the work_struct which we want to kill,
> > : then it waits for its handler to complete on all CPUs.
> > :
> > : Add ->current_work to the "struct cpu_workqueue_struct", it points to
> > : currently running "struct work_struct". When flush_work(work) detects
> > : ->current_work == work, it inserts a barrier at the _head_ of ->worklist
> > : (and thus right _after_ that work) and waits for completition. This means
> > : that the next work fired on that CPU will be this barrier, or another
> > : barrier queued by concurrent flush_work(), so the caller of flush_work()
> > : will be woken before any "regular" work has a chance to run.
> > :
> > : When wait_on_work() unlocks workqueue_mutex (or whatever we choose to protect
> > : against CPU hotplug), CPU may go away. But in that case take_over_work() will
> > : move a barrier we queued to another CPU, it will be fired sometime, and
> > : wait_on_work() will be woken.
> > :
> > : Actually, we are doing cleanup_workqueue_thread()->kthread_stop() before
> > : take_over_work(), so cwq->thread should complete its ->worklist (and thus
> > : the barrier), because currently we don't check kthread_should_stop() in
> > : run_workqueue(). But even if we did, everything should be ok.
> >
> >
> > Why isn't that working in this case??
>
> Cough. Because that "flush_work()" was renamed to cancel_work_sync().
> Because it really cancells the work_struct if it can.
>
> Now we have flush_work() which does not cancel, but waits for completion
> of the single work_struct. Of course, it can hang if the caller holds
> the lock which can be taken by another work in that workqueue.
>
> Oleg.
Andrew's suggestion does make sense though: for any not-in-progress
worklet we can dequeue that worklet and execute it in the flushing
context. [ And if that worklet cannot be dequeued because it's being
processed then that's fine and we can wait on that single worklet, without
waiting on any other 'unrelated' worklets. ]
That does not help work_on_cpu() though: that facility really uses the
fact that workqueues are implemented via per CPU threads - hence we cannot
remove the worklet from the queue and execute it in the flushing context.
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists