Message-ID: <5524F6BD.30105@profitbricks.com>
Date: Wed, 08 Apr 2015 11:37:01 +0200
From: Michael Wang <yun.wang@...fitbricks.com>
To: "Hefty, Sean" <sean.hefty@...el.com>,
Roland Dreier <roland@...nel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: Hal Rosenstock <hal.rosenstock@...il.com>,
Tom Tucker <tom@...ngridcomputing.com>,
Steve Wise <swise@...ngridcomputing.com>,
Hoang-Nam Nguyen <hnguyen@...ibm.com>,
Christoph Raisch <raisch@...ibm.com>,
infinipath <infinipath@...el.com>, Eli Cohen <eli@...lanox.com>,
"Latif, Faisal" <faisal.latif@...el.com>,
Upinder Malhi <umalhi@...co.com>,
Trond Myklebust <trond.myklebust@...marydata.com>,
"J. Bruce Fields" <bfields@...ldses.org>,
"David S. Miller" <davem@...emloft.net>,
"Weiny, Ira" <ira.weiny@...el.com>,
PJ Waskiewicz <pj.waskiewicz@...idfire.com>,
"Nikolova, Tatyana E" <tatyana.e.nikolova@...el.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Jack Morgenstein <jackm@....mellanox.co.il>,
Haggai Eran <haggaie@...lanox.com>,
Ilya Nelkenbaum <ilyan@...lanox.com>,
Yann Droneaud <ydroneaud@...eya.com>,
Bart Van Assche <bvanassche@....org>,
Shachar Raindel <raindel@...lanox.com>,
Sagi Grimberg <sagig@...lanox.com>,
Devesh Sharma <devesh.sharma@...lex.com>,
Matan Barak <matanb@...lanox.com>,
Moni Shoua <monis@...lanox.com>, Jiri Kosina <jkosina@...e.cz>,
Selvin Xavier <selvin.xavier@...lex.com>,
Mitesh Ahuja <mitesh.ahuja@...lex.com>,
Li RongQing <roy.qing.li@...il.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
"Estrin, Alex" <alex.estrin@...el.com>,
Doug Ledford <dledford@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
Erez Shitrit <erezsh@...lanox.com>,
Tom Gundersen <teg@...m.no>,
Chuck Lever <chuck.lever@...cle.com>
Subject: Re: [PATCH v2 13/17] IB/Verbs: Reform cma/ucma with management helpers
Hi, Sean
Thanks for the review :-) cma is the toughest part of the
reform, and I really need some guidance here.
On 04/07/2015 11:36 PM, Hefty, Sean wrote:
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index d8a8ea7..c23f483 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -435,10 +435,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private
>> *id_priv)
>> pkey = ntohs(addr->sib_pkey);
>>
>> list_for_each_entry(cur_dev, &dev_list, list) {
>> - if (rdma_node_get_transport(cur_dev->device->node_type) !=
>> RDMA_TRANSPORT_IB)
>> - continue;
>> -
>> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
>> + if (!rdma_ib_mgmt(cur_dev->device, p))
>> + continue;
>
> This check wants to be something like is_af_ib_supported(). Checking for IB transport may actually be better than checking for IB management. I don't know if IBoE/RoCE devices support AF_IB.
The wrapper makes sense, but do we have a guarantee that an IBoE port won't
be used for an AF_IB address? I just can't locate the place where we filter it out...
>
[snip]
>> - == IB_LINK_LAYER_ETHERNET) {
>> + /* Will this happen? */
>> + BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
>
> This shouldn't happen. The BUG_ON looks okay.
Got it :-)
>
>
>> + if (rdma_transport_iboe(id_priv->id.device, id_priv->id.port_num)) {
>> ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);
>>
>> if (ret)
>> @@ -700,8 +700,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private
>> *id_priv,
>> int ret;
>> u16 pkey;
>>
>> - if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) ==
>> - IB_LINK_LAYER_INFINIBAND)
>> + if (rdma_transport_ib(id_priv->id.device, id_priv->id.port_num))
>> pkey = ib_addr_get_pkey(dev_addr);
>> else
>> pkey = 0xffff;
>
> Check here should be against the link layer, not transport.
I guess the name is confusing us again... what if we use rdma_tech_ib() here?
It's the only tech using the IB link layer; the others are all ETH.
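To make the link-layer versus transport distinction concrete, here is a rough userspace sketch (not kernel code). Every sketch_* name is made up for illustration, and the port struct is only a stand-in for whatever per-port attributes the real helpers would consult. It shows the pkey choice keyed on the link layer, so a port with IB transport over an Ethernet link falls back to the default pkey:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for per-port attributes; not the kernel's types. */
enum sketch_transport  { SKETCH_TRANSPORT_IB, SKETCH_TRANSPORT_IWARP };
enum sketch_link_layer { SKETCH_LL_INFINIBAND, SKETCH_LL_ETHERNET };

struct sketch_port {
	enum sketch_transport  transport;
	enum sketch_link_layer link_layer;
};

#define SKETCH_DEFAULT_PKEY 0xffffu

/*
 * Pick the pkey based on the link layer only: an InfiniBand link uses
 * the pkey taken from the address, anything else gets the default.
 * The transport field is deliberately ignored here.
 */
static inline uint16_t sketch_pick_pkey(const struct sketch_port *port,
					uint16_t addr_pkey)
{
	assert(port != NULL);

	if (port->link_layer == SKETCH_LL_INFINIBAND)
		return addr_pkey;

	return SKETCH_DEFAULT_PKEY;
}
```

With this shape, two ports that share the IB transport but differ in link layer get different answers, which is the case a transport-based check cannot tell apart.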
>
>
>> @@ -735,8 +734,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct
[snip]
>>
>> static void cma_cancel_route(struct rdma_id_private *id_priv)
>> {
>> - switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
>> - case IB_LINK_LAYER_INFINIBAND:
>> + if (rdma_transport_ib(id_priv->id.device, id_priv->id.port_num)) {
>
> The check should be cap_ib_sa()
Got it, will be in next version :-)
All the mcast/sa suggestions below will be applied too.
>
[snip]
>>
>> id_priv->id.route.addr.dev_addr.dev_type =
>> - (rdma_port_get_link_layer(cma_dev->device, p) ==
>> IB_LINK_LAYER_INFINIBAND) ?
>> + (rdma_transport_ib(cma_dev->device, p)) ?
>> ARPHRD_INFINIBAND : ARPHRD_ETHER;
>
> This wants the link layer, or maybe use cap_ipoib.
Is this related to ipoib only?
>
>
>>
>> rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
>> @@ -2536,18 +2508,15 @@ int rdma_listen(struct rdma_cm_id *id, int
>> backlog)
>>
>> id_priv->backlog = backlog;
>> if (id->device) {
>> - switch (rdma_node_get_transport(id->device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> + if (rdma_ib_mgmt(id->device, id->port_num)) {
>
> Want cap_ib_cm()
Will be in next version :-) and the other cap_ib_cm() suggestions too.
>
>
>> ret = cma_ib_listen(id_priv);
[snip]
>> @@ -3016,14 +2979,10 @@ int rdma_accept(struct rdma_cm_id *id, struct
>> rdma_conn_param *conn_param)
>> else
>> ret = cma_rep_recv(id_priv);
>> }
>> - break;
>> - case RDMA_TRANSPORT_IWARP:
>> + } else if (rdma_transport_iwarp(id->device, id->port_num))
>> ret = cma_accept_iw(id_priv, conn_param);
>
> If cap_ib_cm() is used in the places marked above, maybe add a cap_iw_cm() for the else conditions.
Sounds good, will be in next version :-)
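As a rough idea of how the cap_ib_cm()/cap_iw_cm() pair could replace the transport switch, here is a small userspace sketch (not kernel code). The capability bits, the port struct and all sketch_* names are assumptions made up for illustration; only the helper names cap_ib_cm and cap_iw_cm come from this thread, and their real signatures are not decided here:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port capability bits. */
enum sketch_cm_cap {
	SKETCH_CAP_IB_CM = 1u << 0,	/* port is served by the IB CM */
	SKETCH_CAP_IW_CM = 1u << 1,	/* port is served by the iWARP CM */
};

/* Which connection-manager path was taken (stand-in for the real calls). */
enum sketch_cm_path { SKETCH_PATH_IB = 1, SKETCH_PATH_IW = 2 };

struct sketch_cm_port {
	unsigned int caps;
};

static inline bool sketch_cap_ib_cm(const struct sketch_cm_port *port)
{
	assert(port != NULL);
	return (port->caps & SKETCH_CAP_IB_CM) != 0;
}

static inline bool sketch_cap_iw_cm(const struct sketch_cm_port *port)
{
	assert(port != NULL);
	return (port->caps & SKETCH_CAP_IW_CM) != 0;
}

/*
 * Dispatch on capability instead of on transport type: the IB CM path
 * first, then the iWARP CM path, else -ENOSYS as in the quoted patch.
 */
static inline int sketch_accept(const struct sketch_cm_port *port)
{
	if (sketch_cap_ib_cm(port))
		return SKETCH_PATH_IB;
	else if (sketch_cap_iw_cm(port))
		return SKETCH_PATH_IW;
	else
		return -ENOSYS;
}
```

The callers then ask what the port can do rather than what kind of device it is, so a new technology only has to set the right bits.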
Regards,
Michael Wang
>
>
>> - break;
>> - default:
>> + else
>> ret = -ENOSYS;
>> - break;
>> - }
>>
>> if (ret)
>> goto reject;
>> @@ -3067,8 +3026,7 @@ int rdma_reject(struct rdma_cm_id *id, const void
>> *private_data,
>> if (!id_priv->cm_id.ib)
>> return -EINVAL;
>>
>> - switch (rdma_node_get_transport(id->device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> + if (rdma_ib_mgmt(id->device, id->port_num)) {
>
> cap_ib_cm()
>
>
>> if (id->qp_type == IB_QPT_UD)
>> ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
>> private_data, private_data_len);
>> @@ -3076,15 +3034,11 @@ int rdma_reject(struct rdma_cm_id *id, const void
>> *private_data,
>> ret = ib_send_cm_rej(id_priv->cm_id.ib,
>> IB_CM_REJ_CONSUMER_DEFINED, NULL,
>> 0, private_data, private_data_len);
>> - break;
>> - case RDMA_TRANSPORT_IWARP:
>> + } else if (rdma_transport_iwarp(id->device, id->port_num)) {
>> ret = iw_cm_reject(id_priv->cm_id.iw,
>> private_data, private_data_len);
>> - break;
>> - default:
>> + } else
>> ret = -ENOSYS;
>> - break;
>> - }
>> return ret;
>> }
>> EXPORT_SYMBOL(rdma_reject);
>> @@ -3098,22 +3052,17 @@ int rdma_disconnect(struct rdma_cm_id *id)
>> if (!id_priv->cm_id.ib)
>> return -EINVAL;
>>
>> - switch (rdma_node_get_transport(id->device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> + if (rdma_ib_mgmt(id->device, id->port_num)) {
>> ret = cma_modify_qp_err(id_priv);
>> if (ret)
>> goto out;
>> /* Initiate or respond to a disconnect. */
>> if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
>> ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
>
> cap_ib_cm()
>
>
>> - break;
>> - case RDMA_TRANSPORT_IWARP:
>> + } else if (rdma_transport_iwarp(id->device, id->port_num)) {
>> ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
>> - break;
>> - default:
>> + } else
>> ret = -EINVAL;
>> - break;
>> - }
>> out:
>> return ret;
>> }
>> @@ -3359,24 +3308,13 @@ int rdma_join_multicast(struct rdma_cm_id *id,
>> struct sockaddr *addr,
>> list_add(&mc->list, &id_priv->mc_list);
>> spin_unlock(&id_priv->lock);
>>
>> - switch (rdma_node_get_transport(id->device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> - switch (rdma_port_get_link_layer(id->device, id->port_num)) {
>> - case IB_LINK_LAYER_INFINIBAND:
>> - ret = cma_join_ib_multicast(id_priv, mc);
>> - break;
>> - case IB_LINK_LAYER_ETHERNET:
>> - kref_init(&mc->mcref);
>> - ret = cma_iboe_join_multicast(id_priv, mc);
>> - break;
>> - default:
>> - ret = -EINVAL;
>> - }
>> - break;
>> - default:
>> + if (rdma_transport_iboe(id->device, id->port_num)) {
>> + kref_init(&mc->mcref);
>> + ret = cma_iboe_join_multicast(id_priv, mc);
>> + } else if (rdma_transport_ib(id->device, id->port_num))
>> + ret = cma_join_ib_multicast(id_priv, mc);
>
> cap_ib_mcast()
>
>
>> + else
>> ret = -ENOSYS;
>> - break;
>> - }
>>
>> if (ret) {
>> spin_lock_irq(&id_priv->lock);
>> @@ -3404,19 +3342,17 @@ void rdma_leave_multicast(struct rdma_cm_id *id,
>> struct sockaddr *addr)
>> ib_detach_mcast(id->qp,
>> &mc->multicast.ib->rec.mgid,
>> be16_to_cpu(mc->multicast.ib->rec.mlid));
>> - if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
>> - switch (rdma_port_get_link_layer(id->device, id->port_num)) {
>> - case IB_LINK_LAYER_INFINIBAND:
>> - ib_sa_free_multicast(mc->multicast.ib);
>> - kfree(mc);
>> - break;
>> - case IB_LINK_LAYER_ETHERNET:
>> - kref_put(&mc->mcref, release_mc);
>> - break;
>> - default:
>> - break;
>> - }
>> - }
>> +
>> + /* Will this happen? */
>> + BUG_ON(id_priv->cma_dev->device != id->device);
>
> Should not happen
>
>> +
>> + if (rdma_transport_ib(id->device, id->port_num)) {
>> + ib_sa_free_multicast(mc->multicast.ib);
>> + kfree(mc);
>
> cap_ib_mcast()
>
>
>> + } else if (rdma_transport_iboe(id->device,
>> + id->port_num))
>> + kref_put(&mc->mcref, release_mc);
>> +
>> return;
>> }
>> }
>> diff --git a/drivers/infiniband/core/ucma.c
>> b/drivers/infiniband/core/ucma.c
>> index 45d67e9..42c9bf6 100644
>> --- a/drivers/infiniband/core/ucma.c
>> +++ b/drivers/infiniband/core/ucma.c
>> @@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file
>> *file,
>>
>> resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
>> resp.port_num = ctx->cm_id->port_num;
>> - switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> - switch (rdma_port_get_link_layer(ctx->cm_id->device,
>> - ctx->cm_id->port_num)) {
>> - case IB_LINK_LAYER_INFINIBAND:
>> - ucma_copy_ib_route(&resp, &ctx->cm_id->route);
>> - break;
>> - case IB_LINK_LAYER_ETHERNET:
>> - ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
>> - break;
>> - default:
>> - break;
>> - }
>> - break;
>> - case RDMA_TRANSPORT_IWARP:
>> +
>> + if (rdma_transport_ib(ctx->cm_id->device, ctx->cm_id->port_num))
>> + ucma_copy_ib_route(&resp, &ctx->cm_id->route);
>
> cap_ib_sa()
>
>
>> + else if (rdma_transport_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
>> + ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
>> + else if (rdma_transport_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
>> ucma_copy_iw_route(&resp, &ctx->cm_id->route);
>> - break;
>> - default:
>> - break;
>> - }
>>
>> out:
>> if (copy_to_user((void __user *)(unsigned long)cmd.response,
>
>
> - Sean
>