[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100901133056.GB12447@midget.suse.cz>
Date: Wed, 1 Sep 2010 15:30:56 +0200
From: Jiri Bohac <jbohac@...e.cz>
To: Jarek Poplawski <jarkao2@...il.com>
Cc: Jay Vosburgh <fubar@...ibm.com>, Jiri Bohac <jbohac@...e.cz>,
bonding-devel@...ts.sourceforge.net, markine@...gle.com,
chavey@...gle.com, netdev@...r.kernel.org
Subject: Re: [RFC] bonding: fix workqueue re-arming races
On Wed, Sep 01, 2010 at 12:23:56PM +0000, Jarek Poplawski wrote:
> On 2010-08-31 22:54, Jay Vosburgh wrote:
> > What prevents this from deadlocking such that cpu A is in
> > bond_close, holding RTNL and in cancel_delayed_work_sync, while cpu B is
> > in the above function, trying to acquire RTNL?
>
> I guess this one isn't cancelled in bond_close, so it should be safe.
Nah, Jay was correct. Although this work item is not explicitly
cancelled with cancel_delayed_work_sync(), it is on the same
workqueue as work items that are being cancelled with
cancel_delayed_work_sync(), so this can still cause a deadlock.
Fixed in the new version of the patch by putting these on a
separate workqueue.
> > Also, assuming for the moment that the above isn't a problem,
> > curr_active_slave may be NULL if the last slave is removed between the
> > time bond_alb_promisc_disable is scheduled and when it runs. I'm not
> > sure that the alb_bond_info can be guaranteed to be valid, either, if
> > the mode changed.
>
> Probably some or all of these work functions should test for closing
> eg. with netif_running() or some other flag/variable under rtnl_lock().
>
> ....
> >> static void bond_work_cancel_all(struct bonding *bond)
> >> {
> >> - write_lock_bh(&bond->lock);
> >> - bond->kill_timers = 1;
> >> - write_unlock_bh(&bond->lock);
> >> -
> >> if (bond->params.miimon && delayed_work_pending(&bond->mii_work))
>
> I'd suggest to skip these delayed_work_pending() tests.
I agree, these should not be needed anymore, since bond_close()
now makes sure the re-arming work is cancelled. I'll update the
patch if Jay thinks it's otherwise good.
--
Jiri Bohac <jbohac@...e.cz>
SUSE Labs, SUSE CZ
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists