Message-ID: <55A522B0.6050207@mellanox.com>
Date: Tue, 14 Jul 2015 17:54:40 +0300
From: Haggai Eran <haggaie@...lanox.com>
To: Jason Gunthorpe <jgunthorpe@...idianresearch.com>
CC: Doug Ledford <dledford@...hat.com>, <linux-rdma@...r.kernel.org>,
<netdev@...r.kernel.org>, Liran Liss <liranl@...lanox.com>,
Guy Shapiro <guysh@...lanox.com>,
Shachar Raindel <raindel@...lanox.com>,
Yotam Kenneth <yotamke@...lanox.com>
Subject: Re: [PATCH v1 01/12] IB/core: pass client data to remove() callbacks
On 09/07/2015 00:34, Jason Gunthorpe wrote:
> On Wed, Jul 08, 2015 at 02:29:10PM -0600, Jason Gunthorpe wrote:
>> On Mon, Jun 22, 2015 at 03:42:30PM +0300, Haggai Eran wrote:
>>> An ib_client callback that is called with the lists_rwsem locked only for
>>> read is protected from changes to the IB client lists, but not from
>>> ib_unregister_device() freeing its client data. This is because
>>> ib_unregister_device() will remove the device from the device list with
>>> lists_rwsem locked for write, but perform the rest of the cleanup,
>>> including the call to remove() without that lock.
>>
>> I was going to look at this, but, uh.. it seems mangled, doesn't
>> apply, doesn't seem fixable from here.
>
> Okay, I see, it sits on top of the patch from Matan's last
> posting.. My bad.
No problem.
> Hum... I have to say I don't really like this, changing the ordering
> of client_data = NULL with respect to client->remove doesn't seem like
> a great idea - and the rds changes look scary to me, at least I
> couldn't confidently say they were OK..
>
> And that isn't really the issue - this has nothing to do with
> client_data, it is all about not having a callback running when doing
> remove.
>
> It looks like the way out of this is to have ib_get_net_dev_by_params
> iterate over the client_data_list and use a dedicated flag in that
> struct to indicate that client&device combination is
> remove-in-progress.
>
> This would be a bit more efficient as well, and I would suggest
> passing the context in as an arg to the callback.
>
> client_data_list would change a bit to become write locked first by
> write(lists_rwsem), and then second by the spin lock, so holding
> read(lists_rwsem) while iterating is enough locking, and you'd hold
> lists_rwsem while kfreeing.
So, I don't want to hold lists_rwsem for write while calling add() and
remove(). That would reintroduce the deadlock that required the
lists_rwsem patch in the first place. I guess I can drop lists_rwsem
before the add/remove call and retake it afterwards.
Haggai