Message-Id: <20080328.180924.154907485.davem@davemloft.net>
Date: Fri, 28 Mar 2008 18:09:24 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: johannes@...solutions.net
Cc: davej@...emonkey.org.uk, netdev@...r.kernel.org
Subject: Re: 2.6.25rc7 lockdep trace
From: Johannes Berg <johannes@...solutions.net>
Date: Sat, 29 Mar 2008 02:01:25 +0100
>
> > > You can't flush a workqueue in the device close handler
> > > exactly because of this locking conflict.
> > >
> > > Nobody has come up with a suitable way to fix this yet.
> >
> > Maybe we should check which schedule_work users actually lock the rtnl
> > within the work function and move them to a uses-rtnl-in-work workqueue
> > so that everybody else can have rtnl around flush.
>
> On the other hand, most drivers don't actually care that their work has
> run, they just care that it won't run in the future after they give up
> resources or similar, hence they can and should use cancel_work_sync()
> which doesn't suffer from the deadlock. But that needs actual inspection
> because it does change behaviour from "run and wait for it if scheduled"
> to "cancel if scheduled".

I don't see how you can avoid racing with the transition from
"scheduled" to "executing" without taking the runqueue lock
for the test.

And it is crucial that the workqueue function doesn't
execute "accidentally" due to such a race just as the module,
and thus the workqueue code, is about to be potentially
unloaded.
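
To make the race concrete, here is a minimal userspace sketch of the
"test the state under the lock" idea. It is a model built on pthreads,
not the kernel's workqueue code, and every name in it (model_work,
model_schedule, model_worker_run_one, model_cancel_sync,
model_count_call) is hypothetical. It assumes a single work item and
a single lock standing in for the queue lock.

```c
/* Userspace model only: NOT the kernel workqueue implementation.
 * All names here are hypothetical and exist for illustration. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum work_state { WORK_IDLE, WORK_PENDING, WORK_RUNNING };

struct model_work {
	pthread_mutex_t lock;   /* stands in for the queue lock */
	pthread_cond_t idle;    /* signalled when a running function finishes */
	enum work_state state;
	void (*fn)(void *);
	void *arg;
};

void model_work_init(struct model_work *w, void (*fn)(void *), void *arg)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->idle, NULL);
	w->state = WORK_IDLE;
	w->fn = fn;
	w->arg = arg;
}

/* IDLE -> PENDING. Simplification: refuses to queue while running. */
bool model_schedule(struct model_work *w)
{
	bool queued = false;

	pthread_mutex_lock(&w->lock);
	if (w->state == WORK_IDLE) {
		w->state = WORK_PENDING;
		queued = true;
	}
	pthread_mutex_unlock(&w->lock);
	return queued;
}

/* Worker side: the PENDING -> RUNNING transition happens under the lock. */
bool model_worker_run_one(struct model_work *w)
{
	pthread_mutex_lock(&w->lock);
	if (w->state != WORK_PENDING) {
		pthread_mutex_unlock(&w->lock);
		return false;
	}
	w->state = WORK_RUNNING;
	pthread_mutex_unlock(&w->lock);

	w->fn(w->arg);          /* run the function without holding the lock */

	pthread_mutex_lock(&w->lock);
	assert(w->state == WORK_RUNNING);
	w->state = WORK_IDLE;
	pthread_cond_broadcast(&w->idle);
	pthread_mutex_unlock(&w->lock);
	return true;
}

/* Canceller side: test and clear PENDING under the same lock, so the
 * worker cannot start the function between the test and the clear.
 * If the function is already running, wait for it to finish.
 * Returns true if a pending run was cancelled. */
bool model_cancel_sync(struct model_work *w)
{
	bool cancelled = false;

	pthread_mutex_lock(&w->lock);
	if (w->state == WORK_PENDING) {
		w->state = WORK_IDLE;
		cancelled = true;
	}
	while (w->state == WORK_RUNNING)
		pthread_cond_wait(&w->idle, &w->lock);
	pthread_mutex_unlock(&w->lock);
	return cancelled;
}

/* Example work function: counts how many times it was called. */
void model_count_call(void *arg)
{
	++*(int *)arg;
}
```

In this model the function either never starts (the cancel won the
lock first) or has fully finished by the time model_cancel_sync()
returns. The sketch does nothing to stop somebody from scheduling the
work again afterwards; that still has to be prevented by the caller.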