Date: Mon, 09 Sep 2013 11:48:49 -0500
From: Steve Wise <swise@...ngridcomputing.com>
To: Shawn Bohrer <shawn.bohrer@...il.com>, roland@...estorage.com
CC: Bart Van Assche <bvanassche@....org>,
Shawn Bohrer <sbohrer@...advisors.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Cong Wang <xiyou.wangcong@...il.com>, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, swise@...lsio.com
Subject: Re: rtnl_lock deadlock on 3.10
On 9/6/2013 6:19 PM, Shawn Bohrer wrote:
> On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote:
>> Roland, what do you think?
>>
>> As I've said, I think we should go ahead with using the rtnl lock in
>> the core. Is there a complete patch available for review? It looks
>> like the original was only a partial fix.
> I guess I should realize that when no one jumps at fixing my issues
> for me that they probably aren't simple to fix. The solution that
> Cong proposed was to acquire rtnl_lock() before acquiring the
> infiniband device_mutex, and his partial patch did that in
> ib_register_client(). The problem is that you would also need to do
> that in ib_unregister_client(), ib_register_device(), and
> ib_unregister_device(), and that brings us back to the original
> problem which was that cxgb3 was holding the rtnl_lock() when it
> called ib_register_device(). Thus with the proposed fix I believe
> cxgb3 would already be holding the rtnl_lock() and then call
> ib_register_device() which would try to acquire the rtnl_lock() again
> and deadlock for a different reason.
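For anyone following along, the self-deadlock described in the quoted paragraph can be modelled in user space. This is only a sketch, not kernel code: model_rtnl, model_ib_register_device() and model_cxgb3_register() are hypothetical stand-ins, and an error-checking pthread mutex is assumed so that the second acquisition reports EDEADLK rather than hanging the way a real kernel mutex would.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the kernel's rtnl mutex. */
static pthread_mutex_t model_rtnl;

/* Create the mutex as error-checking so a relock by the owner is reported. */
static int model_rtnl_init(void)
{
	pthread_mutexattr_t attr;
	int rc = pthread_mutexattr_init(&attr);

	if (rc)
		return rc;
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (rc == 0)
		rc = pthread_mutex_init(&model_rtnl, &attr);
	pthread_mutexattr_destroy(&attr);
	return rc;
}

/* Models ib_register_device() under the proposed fix: takes rtnl first. */
static int model_ib_register_device(void)
{
	int rc = pthread_mutex_lock(&model_rtnl);

	if (rc)
		return rc;	/* EDEADLK when the caller already owns the lock */
	/* device_mutex and the client->add() calls would go here */
	rc = pthread_mutex_unlock(&model_rtnl);
	assert(rc == 0);
	return 0;
}

/* Models the cxgb3 path: rtnl is already held when registration starts. */
static int model_cxgb3_register(void)
{
	int rc = pthread_mutex_lock(&model_rtnl);

	if (rc)
		return rc;
	rc = model_ib_register_device();	/* second acquisition, same thread */
	pthread_mutex_unlock(&model_rtnl);
	return rc;
}
```

After model_rtnl_init(), a direct call to model_ib_register_device() succeeds, while model_cxgb3_register() returns EDEADLK. In the kernel the equivalent nested acquisition would block forever instead of returning an error.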
>
> Actually how does this currently work? ib_register_device() calls
> client->add() for each client in the list which should call
> ipoib_add_one() which calls register_netdev(). Shouldn't that also
> deadlock in the cxgb3 case?
cxgb3 is an iWARP device and doesn't support IPoIB.
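To illustrate why the client->add() walk does not reach register_netdev() for an iWARP device, here is a small user-space model. It assumes, as I understand the code, that the IPoIB add callback returns early for any device whose transport is not IB. All names (model_device, model_client, model_ipoib_add_one, model_register_device) are hypothetical and only mirror the shape of the real functions.

```c
#include <assert.h>
#include <stddef.h>

enum model_transport { MODEL_TRANSPORT_IB, MODEL_TRANSPORT_IWARP };

struct model_device {
	enum model_transport transport;
	int netdevs_registered;	/* counts modelled register_netdev() calls */
};

struct model_client {
	void (*add)(struct model_device *dev);
};

/* Models ipoib_add_one(): assumed to bail out early on non-IB transports. */
static void model_ipoib_add_one(struct model_device *dev)
{
	assert(dev != NULL);
	if (dev->transport != MODEL_TRANSPORT_IB)
		return;
	/* Stands in for register_netdev(), which would take rtnl_lock(). */
	dev->netdevs_registered++;
}

/* Models the loop in ib_register_device() that calls each client's add(). */
static void model_register_device(struct model_device *dev,
				  const struct model_client *clients,
				  size_t nclients)
{
	size_t i;

	for (i = 0; i < nclients; i++)
		if (clients[i].add)
			clients[i].add(dev);
}
```

With a single IPoIB-like client in the list, registering an iWARP device leaves netdevs_registered at 0, so the modelled register_netdev() path is never entered, while an IB device bumps it to 1.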
>
> Also while digging through this I think I see another bug which is
> that ipoib_dev_cleanup() can be called from ipoib_add_port() but in
> the current code ipoib_add_port() is not holding the rtnl_lock() which
> appears to be a requirement of ipoib_dev_cleanup().
>
> Sigh... I'm going to stop looking at this for now and hopefully
> someone can propose a better solution to this issue.
I can help with this, but I'm waiting for Roland to chime in.
Steve.