[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DB4PR05MB0863B84F5AF4F8203CFDAC36B1EE0@DB4PR05MB0863.eurprd05.prod.outlook.com>
Date: Wed, 22 Apr 2015 16:16:40 +0000
From: Liran Liss <liranl@...lanox.com>
To: Michael Wang <yun.wang@...fitbricks.com>,
Roland Dreier <roland@...nel.org>,
Sean Hefty <sean.hefty@...el.com>,
Hal Rosenstock <hal.rosenstock@...il.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hal@....mellanox.co.il" <hal@....mellanox.co.il>
CC: Tom Tucker <tom@...ngridcomputing.com>,
Steve Wise <swise@...ngridcomputing.com>,
Hoang-Nam Nguyen <hnguyen@...ibm.com>,
"raisch@...ibm.com" <raisch@...ibm.com>,
Mike Marciniszyn <infinipath@...el.com>,
Eli Cohen <eli@...lanox.com>,
Faisal Latif <faisal.latif@...el.com>,
Jack Morgenstein <jackm@....mellanox.co.il>,
"Or Gerlitz" <ogerlitz@...lanox.com>,
Haggai Eran <haggaie@...lanox.com>,
"Ira Weiny" <ira.weiny@...el.com>, Tom Talpey <tom@...pey.com>,
Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
Doug Ledford <dledford@...hat.com>
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> From: Michael Wang [mailto:yun.wang@...fitbricks.com]
>
> Hi, Liran
>
> Thanks for the comment :-)
>
> On 04/22/2015 01:36 AM, Liran Liss wrote:
> [snip]
> >
> > (**) This has been extended to also encode the transport in the current
> code.
> > At least for user-space visible APIs, we might chose to leave this for
> backward compatibility, but we can consider cleaning up the kernel code.
> >
> > So, I think that our "old-transport" below is just fine.
> > No need to change it (and you aren't, since it is currently implemented as a
> function).
> >
> > The "new-transport" does not really exist, but is broken into several
> capability checks of the L4 transport, optionally with conditions on the link
> type.
> > I would remove the table below and tell what we really want to achieve:
> > ==> move technology-specific feature-check logic out of the (multiple!) IB
> code components and various ULPs into per-feature helpers.
>
> Our purpose is to help core layer do management more clearly, rather then
> referring from transport and linklayer.
>
Right.
> IMHO from management's point of view, what we really care about is whether
> a particular management required by device or not, rather then the details on
> transport and link layer.
>
Depends on who is "we".
For ULPs, you are probably right.
However, core services (e.g., mad management, CM, SA) do care about various details.
In some cases, where it doesn't matter, this code will use management helpers.
In other cases, this code will inspect link, transport, and node attributes of rdma devices.
For example, the CM code has specific code paths for IB, RoCE, and iWARP.
There is no other CM code; there is no reason to abstract 'CM'. This code will have code
paths that depend on various specific details.
> This new transport is only understand by core-layer currently, for user-layer
> we still reserve the old transport for them, next step is to use bitmask instead
> of transport, at that time we can erase the new transport and make the
> whole stuff used by user-layer only :-)
>
I am not sure that we need a bit mask at all.
Your helpers already provide all the useful abstractions, which both core and ULPs call directly.
All the information is inferred directly from <link, transport, node> tuples.
Some of the user-space tools need *exactly* the same reasoning.
For example, management tools manage specific technologies and protocols, not some abstraction.
So, For user-space, we can think about exposing exactly the same helper framework, while providing
backward compatibility for the existing interfaces.
> >
> >
> > Detailed remarks
> > ==============
> >
> > 1) The introduction of cap_*_*() stuff should have been introduced directly
> in patch 02/27.
> > This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and
> increases the number of patches in the patch-set.
> > Do this and remove patches 16-24.
>
> We have some discussion about compress the patch set, merge the reform
> and introducing patch will mix the concept (like the earlier version), IMHO it
> will increase the difficulty of review...
>
> And now since many review already been done, it's not wise to change the
> whole structure of patch set IMHO...
>
I think it is because you are conditioning code on one thing, and then conditioning
the same code on another thing.
This is confusing.
Once we get our abstractions correct (i.e., the right helper functions), you replace the
existing logic with the suitable helper up-front.
> >
> > 2)The name rdma_tech_* is lame.
> > rdma_transport_*(), adhering to the above (*) remark, is much better.
> > For example, both IB and ROCE *do* use the same transport.
>
> We have some discussion on that too, use transport means going back...
>
No.
The existing notion of transport was correct. It was the node type that wasn't.
And in any case the new helpers didn't use it.
We need the original meaning of transport - see my response to Ira.
I propose replacing rdma_node_get_transport() with the following helpers:
- rdma_get_transport()
- rdma_is_ib_transport()
- rdma_is_iwarp_transport()
- ...
> >
> > 3) The name cap_* as it is used above is not accurate.
> > You use it to describe technology characteristics rather than extendable
> capabilities.
> > I would suggest having a single convention for all helpers, such as
> rdma_has_*() and rdma_is_*().
> > For example: cap_ib_smi() ==> rdma_has_smi().
>
> That means going back too...
See response to Ira (https://lkml.org/lkml/2015/4/21/951).
>
> >
> > 4) Remove all capabilities that do not introduce any distinction in the
> current code.
> > We can add them as needed later.
> > This means remove patches:
> > - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all
> > IB devices support ipoib
> > - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB
> devices support AF_IB.
> >
> > On the other hand:
> > - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
> > - cap_ib_sa() might make sense to cut code even further in the CMA, since
> RoCE has a GSI but no SA.
>
> We have discussion on define these helpers previously, again, name is not
> really a problem, I would rather to see such changes in the following series
> after this one working stably :-)
>
The names are not critical. This comment is about introducing helpers that are
do not introduce any new semantic notion in the current patch-set.
cap_ipoib(), for example, is brain-dead because only a single technology (as of now)
enables it: Infiniband.
> >
> > 5) Do no modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform
> > IB-core verbs/uverbs_cmd/sysfs It *is* the link layer!
>
> Actually nothing changed after the modify, the prev purpose it to eliminate
> the link layer helpers.
>
> But now we are not going to remove the helper any more, so let's drop this
> modification in next version :-)
>
You don't add modifications just to drop them later.
Don't add them in the first place!
This patch-set will remain forever in the kernel commit log - we want it to be
as self-explaining and coherent as possible.
Remove this.
> >
> > 6) Remove cap_read_multi_sge
> > It is not device/port feature, but a transport capability.
> > Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in
> 'enum ib_device_cap_flags'.
> >
> > 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
> cap_eth_ah().
> > Address handles that refer to Ethernet links always have Ethernet
> addressing.
> >
> > In the CMA code, using rdma_tech_iboe() is just fine. This is how you define
> cap_eth_ah() anyway.
> > Currently, this patch just adds clutter.
>
> There are also some discussion on these helpers, drop them means going
> back..
>
Back to where? Management helpers are a new concept. Let's get them right.
> The tech helper is not enough to explain the management purpose, and this
> can be the wrapper for bitmask stuff too.
>
As I said, I am not sure that we will need any bitmasks.
Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).
> >
> > 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
> > We do need a transport qualifier, as exemplified in comment 5) above, and
> for a complete clean model.
> > This is after renaming the function to rdma_is_ib_transport()...
>
> This means going back again... rdma_is_ib_transport() has been used
> previously.
>
> This helper is just to make the review more easier, we won't need it
> internally, not to mention after bitmask was introduced :-)
>
The same...
> >
> >
> > Putting it all together
> > ==================
> >
> > We are left with the following helpers:
> > - rdma_is_ib_transport()
> > - rdma_is_iwarp_transport()
> > - rdma_is_usnic_transport()
> > - rdma_is_iboe()
> > - rdma_has_mad()
> > - rdma_has_smi()
> > - rdma_has_gsi() - complements smi; can be used by the mad code for
> > clarity
> > - rdma_has_sa()
> > - rdma_has_cm()
> > - rdma_has_mcast()
>
> I think we can put the discussion on name and new helpers in future,
> currently let's focus on these basic reform and make them working stably ;-)
It's not just the names, it's their semantics.
Any problems with the names proposed above?
>
> Regards,
> Michael Wang
>
> >
> >
> >> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> >>
> >>
> >> Since v4:
> >> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> >> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> >> * Fix logical issue inside 3#, 14#
> >> * Refine 3#, 4#, 5# with label 'free'
> >> * Rework 10# to stop using port 1 when port already assigned
> >>
> >> There are plenty of lengthy code to check the transport type of IB
> >> device, or the link layer type of it's port, but actually we are just
> >> speculating whether a particular management/feature is supported by the
> device/port.
> >>
> >> Thus instead of inferring, we should have our own mechanism for IB
> >> management capability/protocol/feature checking, several proposals
> below.
> >>
> >> This patch set will reform the method of getting transport type, we
> >> will now using query_transport() instead of inferring from transport
> >> and link layer respectively, also we defined the new transport type
> >> to make the concept more reasonable.
> >>
> >> Mapping List:
> >> node-type link-layer old-transport new-transport
> >> nes RNIC ETH IWARP IWARP
> >> amso1100 RNIC ETH IWARP IWARP
> >> cxgb3 RNIC ETH IWARP IWARP
> >> cxgb4 RNIC ETH IWARP IWARP
> >> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> >> ocrdma IB_CA ETH IB IBOE
> >> mlx4 IB_CA IB/ETH IB IB/IBOE
> >> mlx5 IB_CA IB IB IB
> >> ehca IB_CA IB IB IB
> >> ipath IB_CA IB IB IB
> >> mthca IB_CA IB IB IB
> >> qib IB_CA IB IB IB
> >>
> >> For example:
> >> if (transport == IB) && (link-layer == ETH) will now become:
> >> if (query_transport() == IBOE)
> >>
> >> Thus we will be able to get rid of the respective transport and
> >> link-layer checking, and it will help us to add new
> >> protocol/Technology (like OPA) more easier, also with the introduced
> >> management helpers, IB management logical will be more clear and easier
> for extending.
> >>
> >> Highlights:
> >> The patch set covered a wide range of IB stuff, thus for those who are
> >> familiar with the particular part, your suggestion would be
> >> invaluable ;-)
> >>
> >> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> >> management helpers, 26#~27# do clean up.
> >>
> >> Patches haven't been tested yet, we appreciate if any one who have
> these
> >> HW willing to provide his Tested-by :-)
> >>
> >> Doug suggested the bitmask mechanism:
> >> https://www.mail-archive.com/linux-
> >> rdma@...r.kernel.org/msg23765.html
> >> which could be the plan for future reforming, we prefer that to be
> another
> >> series which focus on semantic and performance.
> >>
> >> This patch-set is somewhat 'bloated' now and it may be a good timing
> for
> >> staging, I'd like to suggest we focus on improving existed helpers and
> push
> >> all the further reforms into next series ;-)
> >>
> >> Proposals:
> >> Sean:
> >> https://www.mail-archive.com/linux-
> >> rdma@...r.kernel.org/msg23339.html
> >> Doug:
> >> https://www.mail-archive.com/linux-
> >> rdma@...r.kernel.org/msg23418.html
> >> https://www.mail-archive.com/linux-
> >> rdma@...r.kernel.org/msg23765.html
> >> Jason:
> >> https://www.mail-archive.com/linux-
> >> rdma@...r.kernel.org/msg23425.html
> >>
> >> Michael Wang (27):
> >> IB/Verbs: Implement new callback query_transport()
> >> IB/Verbs: Implement raw management helpers
> >> IB/Verbs: Reform IB-core mad/agent/user_mad
> >> IB/Verbs: Reform IB-core cm
> >> IB/Verbs: Reform IB-core sa_query
> >> IB/Verbs: Reform IB-core multicast
> >> IB/Verbs: Reform IB-ulp ipoib
> >> IB/Verbs: Reform IB-ulp xprtrdma
> >> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> >> IB/Verbs: Reform cm related part in IB-core cma/ucm
> >> IB/Verbs: Reform route related part in IB-core cma
> >> IB/Verbs: Reform mcast related part in IB-core cma
> >> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> >> IB/Verbs: Reform cma_acquire_dev()
> >> IB/Verbs: Reform rest part in IB-core cma
> >> IB/Verbs: Use management helper cap_ib_mad()
> >> IB/Verbs: Use management helper cap_ib_smi()
> >> IB/Verbs: Use management helper cap_ib_cm()
> >> IB/Verbs: Use management helper cap_iw_cm()
> >> IB/Verbs: Use management helper cap_ib_sa()
> >> IB/Verbs: Use management helper cap_ib_mcast()
> >> IB/Verbs: Use management helper cap_ipoib()
> >> IB/Verbs: Use management helper cap_read_multi_sge()
> >> IB/Verbs: Use management helper cap_af_ib()
> >> IB/Verbs: Use management helper cap_eth_ah()
> >> IB/Verbs: Clean up rdma_ib_or_iboe()
> >> IB/Verbs: Cleanup rdma_node_get_transport()
> >>
> >> ---
> >> drivers/infiniband/core/agent.c | 4
> >> drivers/infiniband/core/cm.c | 26 +-
> >> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> >> drivers/infiniband/core/device.c | 1
> >> drivers/infiniband/core/mad.c | 51 ++--
> >> drivers/infiniband/core/multicast.c | 18 -
> >> drivers/infiniband/core/sa_query.c | 41 +--
> >> drivers/infiniband/core/sysfs.c | 8
> >> drivers/infiniband/core/ucm.c | 5
> >> drivers/infiniband/core/ucma.c | 27 --
> >> drivers/infiniband/core/user_mad.c | 32 +-
> >> drivers/infiniband/core/uverbs_cmd.c | 6
> >> drivers/infiniband/core/verbs.c | 33 --
> >> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> >> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> >> drivers/infiniband/hw/cxgb4/provider.c | 7
> >> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> >> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> >> drivers/infiniband/hw/ehca/ehca_main.c | 1
> >> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> >> drivers/infiniband/hw/mlx4/main.c | 10
> >> drivers/infiniband/hw/mlx5/main.c | 7
> >> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> >> drivers/infiniband/hw/nes/nes_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> >> drivers/infiniband/hw/qib/qib_verbs.c | 7
> >> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> >> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> >> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> >> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> >> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> >> 35 files changed, 584 insertions(+), 368 deletions(-)
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> >> in the body of a message to majordomo@...r.kernel.org More majordomo
> >> info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists