[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1213368184.16944.20.camel@twins>
Date: Fri, 13 Jun 2008 16:43:04 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Oleg Nesterov <oleg@...sign.ru>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Jarek Poplawski <jarkao2@...pl>,
Max Krasnyansky <maxk@...lcomm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] workqueues: insert_work: use "list_head *" instead of
"int tail"
On Fri, 2008-06-13 at 18:26 +0400, Oleg Nesterov wrote:
> On 06/12, Peter Zijlstra wrote:
> >
> > On Thu, 2008-06-12 at 21:44 +0400, Oleg Nesterov wrote:
> >
> > > > Hence that idea of flush context and completions.
> > >
> > > Do you mean something like (just for example) below? If yes, then yes
> > > sure, flush_work() is limited. But I can't see how it is possible to
> > > "generalize" this idea.
> > >
> > > (hmm... actually, if we add flush_work(), we can speedup schedule_on_each_cpu(),
> > > instead of flush_workqueue(keventd_wq) we can do
> > >
> > > for_each_online_cpu(cpu)
> > > flush_work(per_cpu_ptr(works, cpu));
> > >
> > > not sure this really makes sense though).
> >
> > Speedups are always nice ;-),
>
> OK, I'm sending the patch.
>
> > but the below also gets us there.
>
> yeah, and it needs only 1 wakeup. But otoh it is much more complex :(
>
> > > +struct xxx
> > > +{
> > > + atomic_t count;
> > > + struct completion done;
> > > + work_func_t func;
> > > +};
> > > +
> > > +struct yyy
> > > +{
> > > + struct work_struct work;
> > > + struct xxx *xxx;
> > > +};
> > > +
> > > +static void yyy_func(struct work_struct *work)
> > > +{
> > > + struct xxx *xxx = container_of(work, struct yyy, work)->xxx;
> > > + xxx->func(work);
> > > +
> > > + if (atomic_dec_and_test(&xxx->count))
> > > + complete(&xxx->done);
> > > +}
> > > ...
> >
> > Yes, along those lines.
> >
> > you can call xxx a flush_context and create an interface like:
> >
> > int queue_work_contex(struct workqueue_struct *wq,
> > struct flush_context *fc, struct work_struct *work)
> > {
> > work->context = fc;
> > return queue_work(wq, work);
> > }
> >
> > void flush_workqueue_context(struct workqueue_strucy *wq, t
> > struct flush_context *fc)
> > {
> > if (atomic_read(&context->count))
> > wait_for_completion(&fc->completion);
> > /* except that the above is racy, wait_event() comes to mind */
> > }
> >
> > of course run_workqueue() would then need to be augmented with something
> > like:
> >
> > context = work->context;
> > ...
> > f(work);
> > ...
> > if (context && atomic_dec_and_test(&context->count))
> > complete(&context->done);
>
> > also, I seem to have quitely ignored the fact that struct work doesn't
> > have the context pointer, and growing it unconditionally like this isn't
> > nice - hummm,. perhaps we have a bit left in data and can signify a
> > larger struct work_struct.. ?
>
> Yes, we have a free bit... but afaics we can do better.
>
> struct context_barrier {
> struct work_struct work;
> struct flush_context *fc;
> ...
> }
>
> void context_barrier_barrier_func(struct work_struct *work)
> {
> struct flush_context *fc = container_of();
> if (atomic_dec_and_test())
> ...
> }
>
> void insert_context_barrier(work, barr)
> {
> ...insert barr after work, like flush_work() does...
> }
>
> queue_work_contex(struct workqueue_struct *wq,
> struct work_struct *work,
> struct flush_context *fc)
> {
> int ret = queue_work(wq, work);
> if (ret)
> insert_context_barrier(work, barr);
> return ret;
> }
>
> this way we shouldn't change run_workqueue() and introduce a "parallel"
> larger work_struct which needs its own INIT_()/etc.
Where does do the context_barrier instances come from, are they
allocated in insert_context_barrier() ?
> However I'm a bit sceptical this will be widely used... I may be wrong.
I have no idea either - but it doesn't seem weird to have multiple
worklets outstanding without knowing which.
> > making all this PI savvy for -rt is going to be fun though.. I guess we
> > can just queue a normal barrier of the flusher's priority, and cancel it
> > once we complete.. hey - that doesn't sound hard at all :-)
>
> Yes!!! I think this is much better (because _much_ simple) than re-ordering
> the pending work_struct's, we can just boost the whole ->worklist. We can
> implement flush_work_pi() in the same manner as queue_work_contex() above.
> That is why I said previously that flush_() should govern the priority,
> not queue.
>
> But we can also implement queue_work_pi(struct work_struct_pi *work).
in -rt all the workqueue stuff is PI already, we replaced the list_head
in work_struct with a plist_node and queue_work() enqueues worklets at
the ->normal_prio of the calling task.
Hmm, the required barrier might still spoil - or at least complicate the
matter.
In order to boost the pending work, we'd need to enqueue a barrer before
the high prio worklet - otherwise the high prio worklet will complete
before the work we're waiting on.
And we can cancel the high prio worklet once all our worklets of
interrest are done, but we cannot cancel the barrier.
So it would leave a barrier behind.
It comes back to me, PI workqueues was comlicated..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists