Message-ID: <20080613151725.GA9194@tv-sign.ru>
Date:	Fri, 13 Jun 2008 19:17:25 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jarek Poplawski <jarkao2@...pl>,
	Max Krasnyansky <maxk@...lcomm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] workqueues: insert_work: use "list_head *" instead of "int tail"

On 06/13, Peter Zijlstra wrote:
>
> On Fri, 2008-06-13 at 18:26 +0400, Oleg Nesterov wrote:
> > 
> > Yes, we have a free bit... but afaics we can do better.
> > 
> > 	struct context_barrier {
> > 		struct work_struct   work;
> > 		struct flush_context *fc;
> > 		...
> > 	}
> > 
> > 	void context_barrier_func(struct work_struct *work)
> > 	{
> > 		struct flush_context *fc = container_of();
> > 		if (atomic_dec_and_test())
> > 			...
> > 	}
> > 
> > 	void insert_context_barrier(work, barr)
> > 	{
> > 		...insert barr after work, like flush_work() does...
> > 	}
> > 
> > 	queue_work_context(struct workqueue_struct *wq,
> > 			  struct work_struct *work,
> > 			  struct flush_context *fc)
> > 	{
> > 		int ret = queue_work(wq, work);
> > 		if (ret)
> > 			insert_context_barrier(work, barr);
> > 		return ret;
> > 	}
> > 
> > this way we needn't change run_workqueue() or introduce a "parallel"
> > larger work_struct which needs its own INIT_()/etc.
> 
> Where do the context_barrier instances come from? Are they
> allocated in insert_context_barrier()?

Ah, indeed. We have to kmalloc() or use a larger work_struct... And if we
use a larger work_struct, we can make

	struct work_struct_context
	{
		struct work_struct work;
		work_func_t real_func;
		struct flush_context *fc;
	};

	static void work_context_func(struct work_struct *work)
	{
		struct work_struct_context *wc = container_of();
		struct flush_context *fc = wc->fc;

		wc->real_func(work);

		if (atomic_dec_and_test(&fc->count))
			...
	}

to avoid changing run_workqueue(), but this is nasty. Perhaps flush_context
can just be an array or list of the queued work_structs; then we can do
flush_work() for each work_struct...

> > > making all this PI savvy for -rt is going to be fun though.. I guess we
> > > can just queue a normal barrier of the flusher's priority, and cancel it
> > > once we complete.. hey - that doesn't sound hard at all :-)
> >
> > Yes!!! I think this is much better (because it is _much_ simpler) than
> > re-ordering the pending work_struct's; we can just boost the whole
> > ->worklist. We can implement flush_work_pi() in the same manner as
> > queue_work_context() above.
> > That is why I said previously that flush_() should govern the priority,
> > not queue.
> >
> > But we can also implement queue_work_pi(struct work_struct_pi *work).
>
> in -rt all the workqueue stuff is PI already, we replaced the list_head
> in work_struct with a plist_node and queue_work() enqueues worklets at
> the ->normal_prio of the calling task.

Oh well. IIRC, these patches really uglified^Wcomplicated the code. Don't
get me wrong, if I knew how to make this better, I'd send the patch ;)

> Hmm, the required barrier might still spoil - or at least complicate the
> matter.
>
> In order to boost the pending work, we'd need to enqueue a barrier before
> the high prio worklet - otherwise the high prio worklet will complete
> before the work we're waiting on.

flush_work_pi() can boost the priority, and then barrier restores it.
Can't this work?

> And we can cancel the high prio worklet once all our worklets of
> interest are done, but we cannot cancel the barrier.

Yes, if we cancel the high prio worklet, this doesn't restore the prio
immediately.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
