Message-ID: <20130906231901.GB10419@sbohrermbp13-local.rgmadvisors.com>
Date: Fri, 6 Sep 2013 18:19:01 -0500
From: Shawn Bohrer <shawn.bohrer@...il.com>
To: Steve Wise <swise@...ngridcomputing.com>
Cc: roland@...estorage.com, Bart Van Assche <bvanassche@....org>,
Shawn Bohrer <sbohrer@...advisors.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Cong Wang <xiyou.wangcong@...il.com>, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, swise@...lsio.com
Subject: Re: rtnl_lock deadlock on 3.10
On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote:
> Roland, what do you think?
>
> As I've said, I think we should go ahead with using the rtnl lock in
> the core. Is there a complete patch available for review? looks
> like the original was a partial fix.

I guess I should have realized that when no one jumps at fixing my
issues for me, they probably aren't simple to fix. The solution that
Cong proposed was to acquire rtnl_lock() before acquiring the
infiniband device_mutex, and his partial patch did that in
ib_register_client(). The problem is that you would also need to do
that in ib_unregister_client(), ib_register_device(), and
ib_unregister_device(), and that brings us back to the original
problem, which was that cxgb3 was holding the rtnl_lock() when it
called ib_register_device(). Thus, with the proposed fix, I believe
cxgb3 would already be holding the rtnl_lock() when it calls
ib_register_device(), which would then try to acquire the rtnl_lock()
again and deadlock for a different reason.
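
The re-acquisition described above can be sketched in user space. This
is only an illustration, not kernel code: fake_rtnl is a hypothetical
stand-in for the rtnl mutex, and fake_ib_register_device() and
fake_cxgb3_path() are made-up names modelling ib_register_device()
with the proposed fix applied and a cxgb3-style caller. An
error-checking pthread mutex is used so that the second lock attempt
returns EDEADLK instead of hanging; a kernel mutex would simply block
forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the kernel's rtnl mutex. */
static pthread_mutex_t fake_rtnl;

/* Initialise fake_rtnl as an error-checking mutex so that a relock
 * by the owning thread is reported rather than blocking forever. */
static int fake_rtnl_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!rc)
        rc = pthread_mutex_init(&fake_rtnl, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Models ib_register_device() with the proposed fix: it takes the
 * lock itself before doing any work. */
static int fake_ib_register_device(void)
{
    int rc = pthread_mutex_lock(&fake_rtnl);

    if (rc)
        return rc; /* EDEADLK when the caller already holds the lock */
    /* device_mutex and the client->add() calls would go here */
    pthread_mutex_unlock(&fake_rtnl);
    return 0;
}

/* Models a cxgb3-style caller that already holds the lock when it
 * registers the device. */
static int fake_cxgb3_path(void)
{
    int rc = pthread_mutex_lock(&fake_rtnl);

    if (rc)
        return rc;
    rc = fake_ib_register_device();
    pthread_mutex_unlock(&fake_rtnl);
    return rc;
}
```

After fake_rtnl_init(), calling fake_ib_register_device() on its own
succeeds, while going through fake_cxgb3_path() fails on the inner
lock attempt, which is the self-deadlock case.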

Actually, how does this currently work? ib_register_device() calls
client->add() for each client in the list, which should call
ipoib_add_one(), which calls register_netdev(). Shouldn't that also
deadlock in the cxgb3 case?

Also, while digging through this, I think I see another bug:
ipoib_dev_cleanup() can be called from ipoib_add_port(), but in the
current code ipoib_add_port() is not holding the rtnl_lock(), which
appears to be a requirement of ipoib_dev_cleanup().
Sigh... I'm going to stop looking at this for now and hopefully
someone can propose a better solution to this issue.
Thanks,
Shawn