Message-ID: <20070508120221.GA968@tv-sign.ru>
Date: Tue, 8 May 2007 16:02:21 +0400
From: Oleg Nesterov <oleg@...sign.ru>
To: Jarek Poplawski <jarkao2@...pl>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
David Chinner <dgc@....com>,
David Howells <dhowells@...hat.com>,
Gautham Shenoy <ego@...ibm.com>, Ingo Molnar <mingo@...e.hu>,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix
On 05/08, Jarek Poplawski wrote:
>
> On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote:
> > On 05/07, Jarek Poplawski wrote:
> ...
> > I am not happy with the complication this patch adds; mostly
> > I hate this smp_wmb() in insert_work(). I have an idea how to
> > remove it later, but this needs another patch not related to
> > workqueue.c.
>
> Probably I don't understand those barriers well enough, but IMHO,
> there is something wrong
As I said, I hate this barrier (and the new complications) too.
We can make things a bit better:

	- introduce smp_mb__before_spin_lock()	// a no-op on x86

	- then, after some minimal code changes, we have

		set_wq_data();
		smp_mb__before_spin_lock();
		spin_lock_irqsave(&cwq->lock, flags);
		list_add();
		spin_unlock_irqrestore(&cwq->lock, flags);
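For illustration only, a sketch of what such a helper could look like
(hypothetical - this primitive does not exist; in reality it would live
in the per-arch headers, the CONFIG_X86 switch below is only to show
the intent):

	/*
	 * Order the stores issued before the following spin_lock()
	 * against the critical section. On x86 the locked instruction
	 * in spin_lock() already implies a full memory barrier, so a
	 * compiler barrier is enough there.
	 */
	#ifdef CONFIG_X86
	#define smp_mb__before_spin_lock()	barrier()
	#else
	#define smp_mb__before_spin_lock()	smp_mb()
	#endif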
> if they are needed here, with all this made per cpu.
CPU-hotplug can change the CPU a work item runs on.
> >
> > > BTW, I'm still not convinced all the additions are needed:
> > > the "old" cancel_rearming_ doesn't care about checking
> > > or waiting on anything after a positive del_timer().
> >
> > It would be very strange to do wait_on_work() only in the case
> > when del_timer() failed. That way we would still need to do
> > cancel_work_sync() after cancel_rearming_delayed_work(), but
> > only when del_timer() failed - ugly. Note also that
> > wait_on_work() does not sleep if work->func() is not running.
> >
> > Also, consider this callback:
> >
> > 	void work_handler(struct work_struct *w)
> > 	{
> > 		struct delayed_work *dw = container_of(...);
> >
> > 		queue_delayed_work(dw, delay);
> >
> > 		// <------------- cancel_rearming_delayed_work()
> >
> > 		cancel_delayed_work(dw);
> > 		queue_delayed_work(dw, another_delay);
> > 	}
> >
> > Yes, this is strange and ugly. But correct! The current version
> > (before this patch) can't cancel this delayed_work. The new
> > implementation works correctly. So I think it is far better to
> > do wait_on_work() unconditionally.
>
> I think many curious but correct things are still possible,
> e.g. a handler which fires a timer, or another work, which
> re-arms this work... But maybe it would be reasonable to try
> to separate out the new things which are really necessary -
> IMHO, the (practical) elimination of endless or excessive
> looping. If more functionality is needed, maybe a second
> version should be added, e.g. cancel_rearming_delayed_work_sync().
> Without this you risk that people will really think the old
> version was better, if you knew what you were doing. And the
> worst thing possible: they'll start to use their own, lighter
> solutions.
My goal is just to rename cancel_rearming_delayed_work() to
cancel_delayed_work_sync(), because it no longer requires that @dwork
re-arm itself, and it no longer requires a cancel_work_sync() afterwards.
cancel_rearming_delayed_work() will be obsolete, just like
cancel_rearming_delayed_workqueue().

I am strongly against adding many different variants of cancel_xxx().
With this patch the API is simple (see the usage sketch below):

	- cancel_delayed_work() stops the timer

	- cancel_rearming_delayed_work() stops the timer and the work,
	  and doesn't need cancel_work_sync()/flush_workqueue() afterwards
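To illustrate, a minimal usage sketch (hypothetical driver code, not
from the patch; "my_dwork" and the function name are made up):

	/*
	 * With the patched API a single call is enough even if the
	 * handler re-arms itself; no cancel_work_sync() or
	 * flush_workqueue() is needed afterwards.
	 */
	static void my_driver_exit(void)
	{
		cancel_rearming_delayed_work(&my_dwork);
		/* here the work is neither queued nor running */
	}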
> I think there is no reason such a function should need more
> than one loop over one CPU workqueue to cancel any "normal"
> work (or maybe two - if locking is spared), and it's really
> hard to understand why anybody should wait for all those "not
> mine" - so probably slow and buggy - works in the queue to do
> their share of needless locking/sleeping/looping before "my"
> excellent driver may be reloaded... And I think you are doing
> this now with some reserves (kind of "with one hand") - because
> the _PENDING flag is overloaded here while one flag is still
> free. I really cannot see how sparing one, two or even three
> flag checks during a loop could be worth even one more
> run_workqueue loop without them.
When work->func() re-arms itself, both queue_work() and queue_delayed_work()
may change the CPU if CPU_DOWN_xxx happens. Note that this is possible
even if we use the freezer for cpu-hotplug, because the timer migrates to
another CPU anyway.
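For illustration, a typical self-re-arming handler (a sketch; do_poll()
and my_wq are made-up names):

	/*
	 * If a CPU goes down between the timer firing and the
	 * re-arming below, the delayed timer - and with it the work -
	 * migrates to another CPU, so purely per-cpu cancellation
	 * can't be relied upon.
	 */
	static void poll_handler(struct work_struct *w)
	{
		struct delayed_work *dw =
			container_of(w, struct delayed_work, work);

		do_poll();
		queue_delayed_work(my_wq, dw, HZ);	/* re-arm */
	}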
> Yesterday I wrote something probably against my intention:
> if there is any CPU-burning place which could be fixed,
> it's definitely worth fixing.
Perhaps yes, but see below.
> I'm not sure exactly which place
> you meant - if it's the spinlocking in wait_on_work, maybe
> that's a sign this place isn't optimal either: it seems to me
> all this inserting/waiting for completion could be done under
> existing locks, e.g. in run_workqueue (maybe with some flag?).
Sorry, I can't understand this. Inserting already uses an existing lock,
namely cwq->lock.
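For context, the insert/wait technique under discussion works roughly
like this (a sketch of the barrier idea only; the actual code in the
patch differs in details):

	/*
	 * Queue a dummy "barrier" work on the CPU workqueue and wait
	 * for its completion: once it has run, every work inserted
	 * before it - including the one being cancelled - has finished.
	 */
	struct wq_barrier {
		struct work_struct	work;
		struct completion	done;
	};

	static void wq_barrier_func(struct work_struct *work)
	{
		struct wq_barrier *barr =
			container_of(work, struct wq_barrier, work);

		complete(&barr->done);
	}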
Just in case: I think wait_on_cpu_work() can check ->current_work without
taking cwq->lock, but I am not sure yet. We can remove the unneeded "new |="
from set_wq_data(), and we can do a couple of other optimizations. However,
there are already _a lot_ of workqueue changes in the -mm tree. We can do
all this later.
Right now my only concern is correctness.
Oleg.