[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070420085354.GP32602149@melbourne.sgi.com>
Date: Fri, 20 Apr 2007 18:53:54 +1000
From: David Chinner <dgc@....com>
To: Jarek Poplawski <jarkao2@...pl>
Cc: David Chinner <dgc@....com>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work
On Fri, Apr 20, 2007 at 10:13:26AM +0200, Jarek Poplawski wrote:
> On Fri, Apr 20, 2007 at 12:46:18AM +1000, David Chinner wrote:
> > On Thu, Apr 19, 2007 at 08:54:04AM +0200, Jarek Poplawski wrote:
> > > Hi,
> > >
> > > IMHO cancel_rearming_delayed_work is dangerous place:
> >
> > Agreed - I spent a couple of hours today learning why it
> > can only be used on work functions that always rearm...
> >
> > > - it assumes a work function always rearms (with no exception),
> > > which probably isn't explained enough now (but anyway should
> > > be checked in such loops);
> > >
> > > - probably possible (theoretical) scenario: a few work
> > > functions rearm themselves with very short, equal times;
> > > before flush_workqueue ends, their timers are already
> > > fired, so cancel_delayed_work has nothing to do.
> >
> > Easier than that - have a work function that rearms only if there's
> > more work to do in the future. You only arm the timer when you
> > have work to do, and it only rearms if there's more work to
> > do in the future (e.g. rotating expiry lists).
> >
> > i.e. while there's more work to do, you need to call
> > cancel_rearming_delayed_work() to stop it reliably, but if you race
> > with the work function not restarting itself, you hang.....
>
> I'm not sure I correctly get your point, but according to
> this comment:
>
> " * cancel_rearming_delayed_work - reliably kill off a delayed
> keventd work whose handler rearms the delayed work."
>
> there is a question, whether a function that "rearms only if"
> - "rearms".
Right. Given that the bug I was initially trying to solve was a race
killing off a handler that was rearming itself, that comment says to
me "this is the right thing to do".
> It seems the author of this comment didn't think
> so and it was obvious to him/her cancel_rearming_delayed_work
> wasn't intended for this case. At first I thought it's only a
> language question - now, I see it's probably logical, too.
Yes, after spending another two hours working out why my fix was
then hanging in cancel_rearming_delayed_work() I was a little bit
annoyed at the now obviously misleading comment. Five minutes later
I'd fixed the bug properly. A better comment would have saved me two
hours of wasted time.....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists