[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJ3xEMhzv-CW0XdENn9koScBcxrhXDqEd-Es8=b6edzmHCZrkQ@mail.gmail.com>
Date: Mon, 7 May 2018 20:29:25 +0300
From: Or Gerlitz <gerlitz.or@...il.com>
To: Alex Vesker <valex@...lanox.com>,
Denis Drozdov <denisd@...lanox.com>
Cc: David Miller <davem@...emloft.net>,
Doug Ledford <dledford@...hat.com>,
RDMA mailing list <linux-rdma@...r.kernel.org>,
Linux Netdev List <netdev@...r.kernel.org>,
Jason Gunthorpe <jgg@...lanox.com>,
Don Dutile <ddutile@...hat.com>,
Honggang Li <honli@...hat.com>, Noa Spanier <noas@...lanox.com>
Subject: Re: [PATCH v2 net 0/2] IB/ipoib: ip link support
On Thu, Jan 18, 2018 at 7:25 AM, Jason Gunthorpe <jgg@...lanox.com> wrote:
> On Mon, Jan 15, 2018 at 01:13:09PM -0500, David Miller wrote:
>> From: Denis Drozdov <denisd@...lanox.com>
>> Date: Tue, 9 Jan 2018 23:42:45 +0200
>>
>> > IP link was broken due to the changes in IPoIB for the rdma_netdev
>> > support after commit cd565b4b51e5
>> > ("IB/IPoIB: Support acceleration options callbacks").
>> >
>> > This patchset restores IPoIB pkey creation and removal using rtnetlink.
>> > The first patch introduces changes in the rtnetlink code in order to allow
>> > IPOIB allocate and free the netdevice.
>> >
>> > The second patch establishes appropriate rtnetlink callbacks for IPoIB
>> > device and restores IPoIB netlink support
>> >
>> > Changes since v1:
>> > - Fixed double free_netdev calls in ops->free_link in netdev_run_todo
>> > - Removed priv_size from ipoib_link_ops as not required anymore.
>>
>> Please fix your control flow so that the existing netlink op can do
>> the right thing.
>
> Okay. Since this bug has been outstanding in RDMA for so long, I've
> looked pretty carefully at this..
>
> This patch does go too far, but ipoib does appear to legitimately need
> something special here and no amount of 'control flow fixing' in ipoib
> will solve it.
>
> The fundamental issue is that ipoib no longer has a generic software
> netdev for the child interface. Instead the child interface is bound
> directly to one of several *full* hardware drivers. ie the child and
> the master are basically exactly the same HW dependent code.
>
> Since the child is a full hardware netdev, it has its own HW driven
> needs for priv_size, txqs, rxqs, etc. There is no 'one size fits all'
> value like for all the other software based newlink users. It depends
> on the HW device installed in the system (eg mlx4, mlx5, hfi1)
>
> The only problem with rtnl newlink is that it wants to use priv_size,
> txqs, and rxqs values from a singleton static global structure (eg
> ipoib_link_ops) which cannot know in advance which ipoib HW driver is
> in use by the specific parent the child is being created for.
>
> This is what needs to be fixed. ipoib simply needs to have
> parent-dependent inputs to the alloc_netdev_mqs in rtnl_create_link.
>
> The patch proposes
>
> struct net_device *(*alloc_link)(struct net *src_net,
> const char *dev_name,
> struct nlattr *tb[]);
>
> To let ipoib allocate the netdev, and then obviosly figure out the
> right parameters internally.
>
> This patch also adds some free related stuff, but that seems to be
> going too far. ipoib can use the normal free paths. The above is all
> that is necessary.
>
> An alternative would be something like:
>
> struct rtnl_alloc_params {
> size_t priv_size;
> int txqs;
> int rxqs;
> };
> void (*get_alloc_parameters)(struct net_device *net,
> struct rtnl_alloc_params *parms);
>
> To let ipoib tell rtnl_create_link what parameters to use based on
> 'net' instead of taking them from the global singleton.
>
> Do you see another choice?
>
> Can you accept one of these options?
>
> Thanks,
> Jason
> (BTW, Doug & I are now co-maintaining RDMA, so I say the above with my
> maintainer hat on. This is a serious userspace visible regression bug
> on our side, and I really want to see it fixed.)
Guys,
This is broken upstream for a long time, and now in few distributions
that picked
the patches that introduce the breakage, what is the plan?
Or.
Powered by blists - more mailing lists